Abstract

We show that an infinite weighted tree admits a bi-Lipschitz embedding into Hilbert space 
if and only if it does not contain arbitrarily large complete binary trees with uniformly bounded 
distortion. We also introduce a new metric invariant called Markov convexity, and show how it 
can be used to compute the Euclidean distortion of any metric tree up to universal factors. 

1 Introduction 

Given two metric spaces $(X,d_X)$, $(Y,d_Y)$, and a mapping $f : X \to Y$, we denote the Lipschitz
constant of $f$ by $\|f\|_{\mathrm{Lip}}$. If $f$ is injective then the (bi-Lipschitz) distortion of $f$ is defined as
$\mathrm{dist}(f) = \|f\|_{\mathrm{Lip}} \cdot \|f^{-1}\|_{\mathrm{Lip}}$. The smallest distortion with which $X$ embeds into $Y$ is denoted $c_Y(X)$,
i.e. $c_Y(X) = \inf\{\mathrm{dist}(f) : f : X \hookrightarrow Y\}$. When $Y = L_p$ for some $p \ge 1$ we use the shorter notation
$c_p(X) = c_{L_p}(X)$. The parameter $c_2(X)$ is known in the literature as the Euclidean distortion of $X$.
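For a finite metric space the quantities above can be computed by brute force over all pairs. The following is a minimal sketch, not part of the paper: the function names `lipschitz_constant` and `distortion` and the toy example are illustrative assumptions.

```python
from itertools import combinations

def lipschitz_constant(points, d_src, d_dst, f):
    """Smallest L with d_dst(f(x), f(y)) <= L * d_src(x, y) over all pairs."""
    return max(d_dst(f(x), f(y)) / d_src(x, y)
               for x, y in combinations(points, 2))

def distortion(points, d_src, d_dst, f):
    """dist(f) = ||f||_Lip * ||f^{-1}||_Lip for an injective f on a finite set."""
    expansion = lipschitz_constant(points, d_src, d_dst, f)
    # The Lipschitz constant of the inverse is the largest contraction ratio.
    contraction = max(d_src(x, y) / d_dst(f(x), f(y))
                      for x, y in combinations(points, 2))
    return expansion * contraction

# Hypothetical example: three points on a line, mapped by x -> x^2.
POINTS = [0, 1, 2]
line_metric = lambda x, y: abs(x - y)
EXAMPLE_DISTORTION = distortion(POINTS, line_metric, line_metric, lambda x: x * x)
```

In this example the pair (1, 2) is stretched by a factor 3 while no pair is contracted, so the distortion is 3; rescaling a map by a constant leaves its distortion unchanged.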

The ubiquitous problem of embedding metric spaces into "simpler" spaces occurs in various 
aspects of functional analysis, Riemannian geometry, group theory, and computer science. In most 
cases low distortion embeddings are used to "simplify" a geometric object by representing it as a 
subset of a better understood geometry. In other cases, embeddings are used to characterize
important invariants such as various notions of dimensionality in metric spaces, and superreflexivity, type
and cotype of normed spaces. More recently, striking applications of bi-Lipschitz embeddings were 
found in computer science, where the information obtained from concrete geometric representations 
of finite spaces is used to design efficient approximation algorithms and data structures. 

The present paper is devoted to the study of the Euclidean (and $L_p$) distortion of trees. In what
follows by a metric tree we mean the shortest path metric induced on the vertices of a weighted
graph-theoretical tree $T = (V,E)$. In fact, all of our results will hold true for arbitrary subsets
of metric trees, which are characterized among all metric spaces by the well known four point
condition: For every four points $x,y,u,v$, two of the three numbers $\{d(x,y) + d(u,v),\ d(x,u) +
d(y,v),\ d(x,v) + d(y,u)\}$ are the same, and that number is at least as large as the third (see [10]).
But, because our statements are asymptotic in nature, this does not increase the generality of our
results, since Gupta [16] proved that any finite subset of a metric tree is bi-Lipschitz equivalent to a
metric tree with distortion at most 8. The $\mathbb{R}$-tree corresponding to a tree $T$ is the one-dimensional
simplicial complex induced by $T$, i.e. the path metric obtained by replacing each edge in $T$ by an
interval whose length is the weight of the edge. The $\mathbb{R}$-tree corresponding to $T$ will be denoted
$T_{\mathbb{R}}$. In what follows, when we refer to an $\mathbb{R}$-tree we mean an $\mathbb{R}$-tree corresponding to some metric
tree. We will see later that for every metric tree $T$ and every $p \ge 1$, $c_p(T)$ has the same order of
magnitude as $c_p(T_{\mathbb{R}})$, so in most cases the distinction between metric trees and $\mathbb{R}$-trees will not be
important, though in a few instances we will need to distinguish the two notions.
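The four point condition is easy to test on a finite metric: among the three pairings of any four points, the two largest sums must coincide. The sketch below is illustrative only; the function name `satisfies_four_point`, the tolerance parameter, and the two toy metrics are assumptions made for the example.

```python
from itertools import combinations

def satisfies_four_point(points, d, tol=1e-9):
    """Check the four point condition for the metric d on a finite point set."""
    for x, y, u, v in combinations(points, 4):
        sums = sorted([d(x, y) + d(u, v),
                       d(x, u) + d(y, v),
                       d(x, v) + d(y, u)])
        # "Two are equal and at least as large as the third" means that
        # the two largest of the three sums coincide.
        if sums[2] - sums[1] > tol:
            return False
    return True

# Four points on a path (a tree metric) and the 4-cycle (not a tree metric).
path_metric = lambda i, j: abs(i - j)
cycle_metric = lambda i, j: min(abs(i - j), 4 - abs(i - j))
PATH_OK = satisfies_four_point(range(4), path_metric)
CYCLE_OK = satisfies_four_point(range(4), cycle_metric)
```

The path passes the test, while the 4-cycle fails it because its three sums are 2, 2 and 4.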

Let $B_k$ denote the complete binary tree of depth $k$ (with unit edge weights). In a famous
paper [6] Bourgain proved that the Euclidean distortion of $B_k$ is $\Theta(\sqrt{\log k})$. Moreover, he showed
that a Banach space $Y$ is superreflexive (i.e. admits an equivalent uniformly convex norm) if and
only if $\lim_{k\to\infty} c_Y(B_k) = \infty$. This remarkable characterization of a linear property of Banach spaces
in terms of their metric structure sparked a considerable amount of work on problems of a similar
flavor (see the introduction of [30] for more information on this topic). Among the corollaries of
Bourgain's work is the following dichotomy: For a Banach space $Y$ either $c_Y(B_k) = 1$ for all $k$,
or there exists $\alpha > 0$ such that $c_Y(B_k) = \Omega((\log k)^{\alpha})$ (similar phenomena are known to hold in a
few other cases; see [7], [30]). Moreover, Bourgain used his theorem to solve a question posed by
Gromov, showing that the hyperbolic plane does not admit a bi-Lipschitz embedding into Hilbert
space. Similar applications of Bourgain's theorem to prove that certain metric spaces do not embed
into Hilbert space were obtained by Benjamini and Schramm [5] in the case of graphs with positive
Cheeger constant, and by Leuzinger [21] in the case of certain Tits buildings.

The bi-Lipschitz structure of trees has been studied extensively. Trees are the "building blocks" 
of hyperbolic geometry, and embeddings of certain non-positively curved spaces into products 
of trees are used in several contexts (see for example [HI [HI [3T\ I20|). Similar results (known as 
"probabilistic embeddings into trees") are a powerful tool in computer science (see for example [U 
[T2]). We refer to [lOl [271 [E] for other results on the Lipschitz structure of trees. In spite of these 
applications, and the vast amount of work on trees in the Lipschitz category, the following natural 
problem remained open: When does an infinite metric tree embed with finite distortion into Hilbert 
space? One of the main results of this paper is the following answer to this question. 

Theorem 1.1. Let $T = (V,E)$ be an infinite metric tree. Then the following conditions are
equivalent.

1. $c_2(T) = \infty$.

2. $\sup_{k \in \mathbb{N}} c_T(B_k) < \infty$.

3. For every $k \in \mathbb{N}$, $c_T(B_k) = 1$.

In other words, a metric tree admits a bi-Lipschitz embedding into Hilbert space if and only if 
it does not bi-Lipschitzly contain arbitrarily large complete binary trees. Thus there is a unique 
obstruction to a tree being non-Euclidean. Similar "unique obstruction" results are known only 
in very few cases: As we mentioned above, Bourgain [6] proved that complete binary trees are 
the unique obstruction to a Banach space being superreflexive; Bourgain, Milman and Wolfson [7] 
showed that Hamming cubes are the unique obstruction to a metric space having non-trivial type; 
Mendel and Naor [30] showed that integer grids are the unique obstruction to a metric space 
having finite cotype; Thomassen [35] proved that certain transient graphs must contain transient 
trees, and Benjamini and Schramm [5] proved that a graph with positive Cheeger constant must 
contain a tree with positive Cheeger constant. Another result in the spirit of Theorem 1.1 is the
tree Szemerédi theorem of Furstenberg and Weiss [14]: A subset of positive density in the infinite
complete binary tree must contain arbitrarily large copies of complete binary trees.






It is not surprising that Theorem 1.1 is a "local" result, in the sense that it deals with finite
subsets of the metric tree $T$. Indeed, it is well known that a metric space embeds into Hilbert space
if and only if all of its finite subsets do. It is thus natural to expect characterizations in the spirit of
Theorem 1.1 to be local. Let us say that a metric space $X$ is finitely representable in a metric space
$Y$ if there exists a constant $D \ge 1$ such that for every finite subset $F \subseteq X$ we have $c_Y(F) \le D$
(this is an obvious adaptation of standard terminology from Banach space theory). Thus, denoting
by $B_\infty$ the infinite unweighted complete binary tree, Theorem 1.1 can be rephrased as follows: A
metric tree $T$ admits a bi-Lipschitz embedding into Hilbert space if and only if $B_\infty$ is not finitely
representable in $T$. The following section contains optimal quantitative versions of this result, and
explains the ingredients of its proof.

1.1 Markov convexity and quantitative bounds 

Quantitative bounds on the Euclidean distortion of trees were obtained in [24], [28], [25], [17]. In
particular, Matoušek proved in [28] that for any $n$-point metric tree $T$ we have $c_2(T) = O(\sqrt{\log \log n})$.
This result cannot be improved due to Bourgain's lower bound for the complete binary tree. Gupta,
Krauthgamer and Lee [17] obtained upper bounds on the Euclidean distortion of trees in terms of
their doubling constant; in particular, they showed that every doubling tree admits a bi-Lipschitz
embedding into a finite-dimensional Euclidean space. We present a new simpler proof of this fact
in Section 2.3, where we also recall the definition of the doubling constant.
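To see why Matoušek's upper bound cannot be improved, it is enough to count the vertices of the complete binary tree. The short computation below is a sketch that uses only the vertex count of $B_k$ and Bourgain's lower bound of order $\sqrt{\log k}$ on $c_2(B_k)$.

```latex
% B_k is the complete binary tree of depth k; n is its number of vertices.
n = |B_k| = \sum_{i=0}^{k} 2^{i} = 2^{k+1} - 1
\quad\Longrightarrow\quad
\log_2 \log_2 (n+1) = \log_2 (k+1),
\qquad\text{so}\qquad
\sqrt{\log \log n} \;\approx\; \sqrt{\log k} \;\lesssim\; c_2(B_k).
```

Thus for the $n$-point trees $B_k$ the bound $O(\sqrt{\log\log n})$ is attained up to a constant factor.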

We shall now state an optimal quantitative version of Theorem 1.1. Given a metric space
$(X, d_X)$, $k \in \mathbb{N}$ and $c > 1$, we denote

$$\mathscr{B}_X(c) = \max\{k \in \mathbb{N} : c_X(B_k) \le c\}.$$

In what follows we write $A \lesssim B$ to mean $A = O(B)$. If $A \lesssim B$ and $B \lesssim A$ then we write $A \approx B$.

Theorem 1.2. Let $T$ be an arbitrary metric tree. Then for every $c > 1$,



The lower bound in Theorem 1.2 is simply Bourgain's lower bound, and is therefore optimal.
Somewhat surprisingly, the upper bound in Theorem 1.2 cannot be improved. The construction of
a family of trees exhibiting this, which we call the Cantor trees, is presented in Section 3.3.3.

It follows that in order to obtain tight bounds on the Euclidean distortion of a given metric tree
$T$ we need an invariant which is more refined than the size of the biggest binary tree contained in
$T$. This is achieved via the following definition. Let $\{X_t\}_{t=0}^{\infty}$ be a Markov chain on a state space $\Omega$.
Given an integer $k \ge 0$ we denote by $\{\tilde{X}_t(k)\}_{t=0}^{\infty}$ the process which equals $X_t$ for time $t \le k$, and
evolves independently (with respect to the same transition probabilities) for time $t > k$. Observe
that for $k < 0$, $\tilde{X}_t(k)$ and $X_t$ evolve independently for all $t \ge 0$.

Definition 1.3. Let $(X,d)$ be a metric space and $p > 0$. We shall say that $(X,d)$ is Markov
$p$-convex with constant $\Pi$ if for every Markov chain $\{X_t\}_{t=0}^{\infty}$ on a state space $\Omega$, and every $f : \Omega \to X$,
we have for every $m \in \mathbb{N}$,



The least constant $\Pi$ above is called the Markov $p$-convexity constant of $X$, and is denoted $\Pi_p(X)$.
We shall say that $X$ is Markov $p$-convex if $\Pi_p(X) < \infty$.

To understand this notion, recall that the chains $X_t$ and $\tilde{X}_t(t - 2^k)$ run together for the first
$t - 2^k$ steps, and then evolve independently for the remaining $2^k$ steps. Thus the left hand side
in (1) is measuring the sum, over many "dyadic scales" $k \in \{0, 1, 2, \ldots\}$, of the average of the $p$th
power of the normalized "drift" of the chain in $X$ with respect to scale $k$. We will say that $X$ has
non-trivial Markov convexity if $X$ is Markov $p$-convex for some $p < \infty$. We note that $L_2$ is Markov
2-convex. More generally, the name comes from the fact that if $X$ is a Banach space which admits
an equivalent uniformly convex norm whose modulus of convexity is of power type $p$, then $X$ is
also Markov $p$-convex. These results are proved in Section 3.1.
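The pair of processes used in the definition can be simulated directly: run the chain, then restart an independent continuation from the state at the fork time. The following is a minimal sketch under stated assumptions, not the paper's construction: the names `run_chain_with_fork` and `walk_step`, the use of two seeded generators to model independence, and the restriction to fork times $0 \le k \le$ horizon are all choices made for the illustration.

```python
import random

def run_chain_with_fork(step, x0, horizon, k, seed=0):
    """Return (X, X_tilde) for times 0..horizon.

    X_tilde agrees with X for t <= k and afterwards evolves with the same
    transition rule `step`, but driven by a separate random generator.
    Assumes 0 <= k <= horizon.
    """
    if not 0 <= k <= horizon:
        raise ValueError("this sketch only handles 0 <= k <= horizon")
    rng_main = random.Random(seed)
    rng_fork = random.Random(seed + 1)  # independent source after the fork
    x = [x0]
    for _ in range(horizon):
        x.append(step(x[-1], rng_main))
    x_tilde = x[: k + 1]                # shared history up to time k
    for _ in range(horizon - k):
        x_tilde.append(step(x_tilde[-1], rng_fork))
    return x, x_tilde

def walk_step(state, rng):
    """One step of the simple random walk on the integers."""
    return state + rng.choice((-1, 1))

X, X_TILDE = run_chain_with_fork(walk_step, 0, horizon=8, k=5, seed=1)
```

Averaging a power of the distance between `X[t]` and `X_TILDE[t]` over many runs, with the fork placed $2^k$ steps before $t$, gives a Monte Carlo estimate of the "drift at scale $k$" described above.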

In Bourgain's paper [6] there is an implicit "non-linear" notion of uniform convexity related 
to the presence of complete binary trees. For the results in this paper, we require the above 
"Markov variant," analogous to Ball's notion of Markov type [2]. The search for Poincare-type 
inequalities on metric spaces which are analogs of classical Banach space invariants has been
fruitfully investigated by several authors — we refer to the papers [HI [151 (Ml [2 [32l [311 [30] for a 
discussion of this research direction, to which Definition 1.3 belongs. The following theorem shows
that Markov convexity determines the Euclidean distortion of a tree, up to universal factors.

Theorem 1.4. Let $T$ be a metric tree. Then $c_2(T) \approx \Pi_2(T_{\mathbb{R}})$.

Recall that $T_{\mathbb{R}}$ denotes the $\mathbb{R}$-tree corresponding to $T$. See Remark 3.3 for a discussion of why
we have to pass to $\mathbb{R}$-trees in Theorem 1.4.

We also obtain a combinatorial way to compute the Euclidean distortion of any tree. Let
$T = (V, E)$ be a metric tree, and let $\chi : E \to \mathbb{Z}$ be an edge coloring. We call $\chi$ a monotone coloring
if all of its color classes are paths contained in a root-leaf path (such paths are called monotone
paths in what follows). For $\delta \in (0,1)$, the coloring $\chi$ is called $\delta$-strong if it is monotone and for
every $u, v \in V$ at least half of the length of the path joining $u$ and $v$ can be covered by color classes
of length at least $\delta d_T(u,v)$. We define $\delta^*(T)$ to be the largest $\delta$ for which $T$ admits a $\delta$-strong
coloring. The following theorem shows that $\delta^*(T)$ determines the Euclidean distortion of $T$.

Theorem 1.5. Let T be a metric tree. Then 



The upper bound on $c_2(T)$ in Theorem 1.5 continues a theme that also appeared in [24], [28], [17]:
Certain edge colorings can be used to construct embeddings into $L_2$. Specifically, our proof draws
on ideas from Matoušek's embedding [28]. But, Matoušek's argument requires colorings with a
small number of colors, the existence of which depends only on the topology of $T$ and does not
take into account the edge lengths. Our argument for the upper bound, which is contained in
Theorem 2.6, builds on Matoušek's proof while taking the metric into consideration, and is therefore
more involved.

The lower bound on $c_2(T)$ in Theorem 1.5 goes through Theorem 1.4. We construct a special
coloring of $T$, and show that if the coloring is not $\delta$-strong, then we can construct a Markov
chain on $T$ which wanders too quickly for $T$ to have a small Markov 2-convexity constant. This is
done by locating a special type of subtree of $T$, which we call a weak prototype; see Section 3.3
for the definition, where it is shown that weak prototypes cannot have good Markov convexity
properties. This "reconstruction paradigm" is inspired by a result of [17] which shows that if a 
certain procedure fails to produce a good coloring, then the tree under consideration must have a 
large doubling constant. Our approach is able to pick out significantly more delicate sub-structures 
(e.g. embedded complete binary trees or the aforementioned "weak prototypes"). A key difficulty 
that arises in our setting involves choosing the "scale" at which the required weak prototype embeds 
into T. This "scale selection" argument is a central part of our proof of Theorem 11.21 Theorem [T31 
and Theorem 1.5; we refer to Section 2.2 and Section 4 for the details.

We remark that all of our results can be applied to compute the Lp distortion of trees. Namely, 
we show that for every p, c > 1 and every metric tree T, 



where the implied constants may depend only on $p$; see Theorem 4.1.

The use of Markov convexity as a metric invariant, and thus a tool for proving distortion lower
bounds, is not limited to the case of trees. In Section 3.2 we investigate classes of spaces which can
be shown not to embed into $L_2$ by analyzing their Markov convexity. In particular, we prove a lower
bound on the Euclidean distortion of balls of finitely generated groups (equipped with the word
metric) which admit a bounded non-constant harmonic function. We also bound from below the
Euclidean distortion of the lamplighter group over the cycle (see Section 3.2 for the definition). In
a future paper, which will be devoted to embeddings of the lamplighter group, we use the methods
of [31] to show that this group has Markov type 2 in the sense of Ball [2]. Thus, Markov convexity
is the only known invariant which demonstrates that this group does not well-embed into Hilbert
space.

Our results, specifically Theorem 1.5, have algorithmic implications. Given an $n$-point metric
space X, the problem of efficiently computing its distortion in a class of metric spaces up to a small 
factor has attracted a lot of attention in recent years, and is known as the "relative embedding" 
problem. We refer to [I] and the references therein for a discussion of this topic, and also for 
some hardness results. The Euclidean distortion of an n-point metric space can be computed in 
polynomial time, since this problem can be cast as a semidefinite program [23]. Hence Theorem 1.5
yields a polynomial time algorithm for estimating the parameter log {^-^^jx)) ^ constant factor 

for any tree $T$. In conjunction with (3), this gives a polynomial time algorithm which computes
the $L_p$ distortion of any tree up to a universal constant factor. Note that it is not known whether the
$L_p$ distortion of a general finite metric space can be approximated efficiently.

1.2 Some open problems 

We end this introduction by stating some interesting open problems that arise from our work. 

Problem 1. In Section 3.1 we show that every $p$-uniformly convex Banach space is Markov
$p$-convex. We also show that if $X$ is a Banach lattice which is Markov $p$-convex then it is also
$q$-uniformly convex for every $q > p$. The relation between Markov $p$-convexity and uniform
$p$-convexity for general Banach spaces is unclear.

Problem 2. One corollary of our results is that if a metric tree is not Markov $p$-convex for any
$p < \infty$ then it contains arbitrarily large complete binary trees with distortion arbitrarily close to 1.
It is possible that this holds true for arbitrary metric spaces, and not just metric trees. If this is the 
case, then it would correspond to known results in Banach space theory, and would complement 
the existing theory of metric type and cotype. 

Problem 3. It would be interesting to investigate other "unique obstruction" results of the type 
described here. In particular, can one classify the obstructions to a planar graph being embeddable 
in $L_2$? Another interesting generalization would be to classify the subsets of the hyperbolic
plane which embed into $L_2$; it seems plausible that complete binary trees are the only obstruction
in this case, just as for tree metrics. In a similar vein, it might be the case that the only subsets
of a product of trees which do not admit a bi-Lipschitz embedding into $L_2$ are those that contain
arbitrarily large bi-Lipschitz copies of complete binary trees. If true, then in combination with the
result of [8], this would imply the same result for the hyperbolic plane.

Problem 4. In Section 3.2 we give lower bounds on the Euclidean distortion of the lamplighter
group over the $n$-cycle. We do not know the correct asymptotic behavior of this distortion.
It is also unknown whether or not these groups embed into $L_1$ with uniformly bounded distortion.




Figure 1: A schematic description of the implications between the sections in this paper (one
node of the diagram is labeled Theorems 1.4 + 1.5).

2 Distortion bounds via the containment of binary trees



The purpose of this section is to prove the following theorem which, when combined with Bourgain's
lower bound for binary trees [6], implies Theorem 1.1 and Theorem 1.2.

Theorem 2.1. Let $T$ be an arbitrary metric tree and $p \ge 1$. Then for every $c > 1$,



f c y 

Cp(r)<130f^-^T(c)j 



In Section 3.3.3 we will show that the asymptotic dependence on $\mathscr{B}_T(c)$ in the upper bound of
Theorem 2.1 cannot be improved.

2.1 Coloring based upper bounds 

We begin with some definitions and notation. Let $T = (V, E)$ be a finite graph-theoretic tree with
positive edge lengths $\ell : E \to (0, \infty)$, and let $d_T$ be the induced path metric on $V$. We also fix
some arbitrary root $r \in T$. A monotone path in $T$ is a connected subset (also called a segment in
what follows) of some root-leaf path. By an edge-coloring of $T$, we mean a map $\chi : E \to \mathbb{Z}$. We say
that the coloring is monotone if for every $m \in \mathbb{Z}$ the color class $\chi^{-1}(m)$ is a monotone path. For
$u, v \in V$ we let $P(u, v) \subseteq E$ denote the unique path from $u$ to $v$, and set $P(v) = P(v, r)$. Given an
edge coloring $\chi : E \to \mathbb{Z}$, $k \in \chi(E)$, and $u, v \in V$, we write

$$\ell_k^{\chi}(u,v) = \sum_{\substack{e \in P(u,v) \\ \chi(e) = k}} \ell(e).$$

We also set $\ell_k^{\chi}(v) = \ell_k^{\chi}(v, r)$. Finally, given $u, v \in V$ we let $\mathrm{lca}(u,v)$ denote the least common
ancestor of $u$ and $v$ in $T$.
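The notation above translates directly into a small data structure. The sketch below is an illustration under assumptions of our own: a rooted tree is stored as a dictionary `parent` mapping each vertex to a triple (parent, edge length, edge color), with the root mapped to `None`; the names `lca`, `color_lengths` and the example tree `PARENT` are hypothetical.

```python
def lca(parent, u, v):
    """Least common ancestor of u and v in the rooted tree `parent`."""
    ancestors = {u}
    x = u
    while parent[x] is not None:
        x = parent[x][0]
        ancestors.add(x)
    while v not in ancestors:
        v = parent[v][0]
    return v

def color_lengths(parent, u, v):
    """Return {k: total length of the color-k edges on the path P(u, v)}."""
    w = lca(parent, u, v)
    totals = {}
    for start in (u, v):
        x = start
        while x != w:                      # climb from each endpoint to the lca
            p, length, color = parent[x]
            totals[color] = totals.get(color, 0.0) + length
            x = p
    return totals

# Example: root r, path r-a-b colored 0, and the edge a-c colored 1.
PARENT = {
    "r": None,
    "a": ("r", 2.0, 0),
    "b": ("a", 1.0, 0),
    "c": ("a", 3.0, 1),
}
```

In this example both color classes are monotone paths, and `color_lengths(PARENT, "b", "r")` gives the contribution of each color to the path from `b` to the root.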

Definition 2.2 ($\varepsilon$-good coloring). We say that a coloring $\chi : E \to \mathbb{Z}$ is $\varepsilon$-good if it is monotone,
and for every $u, v \in T$, the unique path from $u$ to $v$ contains a monochromatic segment of length
at least $\varepsilon \cdot d_T(u,v)$. We define $\varepsilon^*(T)$ to be the largest $\varepsilon$ for which $T$ admits an $\varepsilon$-good coloring.
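For a fixed monotone coloring, the best $\varepsilon$ for which it is $\varepsilon$-good can be found by brute force. The sketch below is illustrative and rests on two assumptions: the tree is stored as a dictionary mapping each vertex to (parent, edge length, color), and the coloring is already known to be monotone, so that the color-$k$ edges on a path form a single segment. The names `color_lengths`, `goodness` and `PARENT` are hypothetical.

```python
from itertools import combinations

def color_lengths(parent, u, v):
    """Return {k: total length of the color-k edges on the path P(u, v)}."""
    ancestors, x = {u}, u
    while parent[x] is not None:
        x = parent[x][0]
        ancestors.add(x)
    w = v
    while w not in ancestors:              # w becomes lca(u, v)
        w = parent[w][0]
    totals = {}
    for start in (u, v):
        x = start
        while x != w:
            p, length, color = parent[x]
            totals[color] = totals.get(color, 0.0) + length
            x = p
    return totals

def goodness(parent):
    """Largest eps such that the (monotone) coloring is eps-good."""
    best = 1.0
    for u, v in combinations(parent, 2):
        lengths = color_lengths(parent, u, v)
        dist = sum(lengths.values())       # d_T(u, v)
        best = min(best, max(lengths.values()) / dist)
    return best

PARENT = {
    "r": None,
    "a": ("r", 2.0, 0),
    "b": ("a", 1.0, 0),
    "c": ("a", 3.0, 1),
}
EPS = goodness(PARENT)
```

Here the worst pair is (r, c): its path has length 5 and its longest monochromatic segment has length 3, so the coloring is 0.6-good and no better.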

The following simple lemma will not be used in the proof of Theorem 2.1, but we include it
since it illustrates the relation between colorings and embeddings, and it will be used eventually in 
Section I 



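Lemma 2.3. For every weighted tree $T$ and $p \ge 1$,

$$c_p(T) \le \frac{2^{1-1/p}}{\varepsilon^*(T)}.$$

Proof. Fix $\varepsilon < \varepsilon^*(T)$ and let $\chi : E \to \mathbb{Z}$ be an $\varepsilon$-good coloring. Let $\{e_k\}_{k \in \mathbb{Z}}$ be the standard basis
of $\ell_p = \ell_p(\mathbb{Z})$. Define $f : V \to \ell_p$ by

$$f(v) = \sum_{k \in \mathbb{Z}} \ell_k^{\chi}(v)\, e_k.$$

(Recall that $\ell_k^{\chi}(v)$ is the distance that the segment colored $k$ contributes to the path joining $v$ to
the root.)

Fix $u, v \in V$ and write $w = \mathrm{lca}(u, v)$. The fact that the coloring $\chi$ is monotone implies that
$\chi(P(u,w)) \cap \chi(P(v,w)) = \emptyset$. Thus

$$\|f(u) - f(v)\|_p = \Big( \sum_{k \in \mathbb{Z}} [\ell_k^{\chi}(u,w)]^p + \sum_{k \in \mathbb{Z}} [\ell_k^{\chi}(v,w)]^p \Big)^{1/p} \le d_T(u,w) + d_T(v,w) = d_T(u,v).$$

On the other hand, since $\chi$ is $\varepsilon$-good, there are $a, b \in \mathbb{Z}$ such that $\ell_a^{\chi}(u,w) \ge \varepsilon d_T(u,w)$ and
$\ell_b^{\chi}(v,w) \ge \varepsilon d_T(v,w)$. It follows that

$$\|f(u) - f(v)\|_p \ge \big( [\ell_a^{\chi}(u,w)]^p + [\ell_b^{\chi}(v,w)]^p \big)^{1/p} \ge \frac{\varepsilon}{2^{1-1/p}} [d_T(u,w) + d_T(v,w)] = \frac{\varepsilon}{2^{1-1/p}}\, d_T(u,v). \tag{4}$$

□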
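The embedding used in this proof sends each vertex to the vector of color contributions along its path to the root, so it can be checked numerically on small examples. The sketch below is a sanity check under our own assumptions, not a replacement for the proof: vectors in $\ell_p$ are stored as sparse dictionaries, the tree is a dictionary mapping each vertex to (parent, edge length, color), and the names `embed`, `lp_dist`, `tree_dist`, `ratios` and `PARENT` are hypothetical.

```python
from itertools import combinations

def embed(parent, v):
    """f(v): sparse vector {color k: length of color k on the path from v to the root}."""
    vec = {}
    while parent[v] is not None:
        p, length, color = parent[v]
        vec[color] = vec.get(color, 0.0) + length
        v = p
    return vec

def lp_dist(x, y, p):
    """l_p distance between two sparse vectors."""
    keys = set(x) | set(y)
    return sum(abs(x.get(k, 0.0) - y.get(k, 0.0)) ** p for k in keys) ** (1.0 / p)

def dists_to_ancestors(parent, v):
    """Map each ancestor of v (including v) to its tree distance from v."""
    out, total = {v: 0.0}, 0.0
    while parent[v] is not None:
        p, length, _ = parent[v]
        total += length
        out[p] = total
        v = p
    return out

def tree_dist(parent, u, v):
    """d_T(u, v): the minimum over common ancestors is attained at the lca."""
    du, dv = dists_to_ancestors(parent, u), dists_to_ancestors(parent, v)
    return min(du[a] + dv[a] for a in du if a in dv)

def ratios(parent, p):
    """All ratios ||f(u) - f(v)||_p / d_T(u, v) over pairs of distinct vertices."""
    return [lp_dist(embed(parent, u), embed(parent, v), p) / tree_dist(parent, u, v)
            for u, v in combinations(parent, 2)]

PARENT = {"r": None, "a": ("r", 2.0, 0), "b": ("a", 1.0, 0), "c": ("a", 3.0, 1)}
RATIOS = ratios(PARENT, p=2)
DISTORTION = max(RATIOS) / min(RATIOS)
```

On this example the coloring is 0.6-good, no distance is expanded, and the worst contraction occurs for the pair (r, c), giving distortion $5/\sqrt{13} \approx 1.39$.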

To get tighter control on the Euclidean distortion of trees we introduce the notion of $\delta$-strong
colorings.

Definition 2.4 ($\delta$-strong coloring). We say that a coloring $\chi : E \to \mathbb{Z}$ is $\delta$-strong if it is monotone,
and for every $u, v \in V$,

$$\sum_{k \in \mathbb{Z}} \ell_k^{\chi}(u,v) \cdot \mathbf{1}_{\{\ell_k^{\chi}(u,v) \ge \delta d_T(u,v)\}} \ge \frac{1}{2}\, d_T(u,v).$$

In words, we demand that at least half of the shortest path joining $u$ and $v$ is covered by color classes
of length at least $\delta d_T(u,v)$. As before, we define $\delta^*(T)$ to be the largest $\delta$ for which $T$ admits a
$\delta$-strong coloring.
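The condition in this definition is a finite check for a given coloring and a given $\delta$. The sketch below is illustrative only and makes the following assumptions: the tree is stored as a dictionary mapping each vertex to (parent, edge length, color), the coloring is already known to be monotone, and a small tolerance absorbs floating point error. The names `color_lengths`, `is_strong` and `PARENT` are hypothetical.

```python
from itertools import combinations

def color_lengths(parent, u, v):
    """Return {k: total length of the color-k edges on the path P(u, v)}."""
    ancestors, x = {u}, u
    while parent[x] is not None:
        x = parent[x][0]
        ancestors.add(x)
    w = v
    while w not in ancestors:              # w becomes lca(u, v)
        w = parent[w][0]
    totals = {}
    for start in (u, v):
        x = start
        while x != w:
            p, length, color = parent[x]
            totals[color] = totals.get(color, 0.0) + length
            x = p
    return totals

def is_strong(parent, delta, tol=1e-12):
    """Check the delta-strong inequality for every pair of distinct vertices."""
    for u, v in combinations(parent, 2):
        lengths = color_lengths(parent, u, v)
        dist = sum(lengths.values())       # d_T(u, v)
        # Total length of the color classes that are long relative to dist.
        covered = sum(l for l in lengths.values() if l >= delta * dist - tol)
        if covered < dist / 2 - tol:
            return False
    return True

PARENT = {"r": None, "a": ("r", 2.0, 0), "b": ("a", 1.0, 0), "c": ("a", 3.0, 1)}
```

For this example `is_strong(PARENT, 0.5)` holds, while `is_strong(PARENT, 0.7)` fails on the pair (r, c), whose path of length 5 has no color class of length at least 3.5.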

The notions of $\delta$-strong colorings and $\varepsilon$-good colorings are related via the following simple
lemma.

Lemma 2.5. Every weighted tree $T$ satisfies $\delta^*(T) \ge 2^{-3/\varepsilon^*(T)}$.

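Proof. Let $\chi$ be an $\varepsilon$-good coloring of $T$. We will prove that it is also $2^{-3/\varepsilon}$-strong. In fact, we
shall show that for every $\alpha \in (0, 1]$ and $u, v \in V$, the total length of the monochromatic segments
of length at least $\alpha d_T(u,v)$ on the path $P(u,v)$ satisfies

$$\sum_{k \in \mathbb{Z}} \ell_k^{\chi}(u,v) \cdot \mathbf{1}_{\{\ell_k^{\chi}(u,v) \ge \alpha d_T(u,v)\}} \ge \left( 1 - \left( \frac{\alpha}{\varepsilon} \right)^{\varepsilon/2} \right) d_T(u,v). \tag{5}$$

Choosing $\alpha = 2^{-3/\varepsilon}$ in (5), and using the fact that $\varepsilon^{\varepsilon/2} \ge 1/\sqrt{2}$, we deduce that $\chi$ is $2^{-3/\varepsilon}$-strong. The
proof of (5) is by induction on $d_T(u,v)$. If $d_T(u,v)$ is minimal then $P(u,v)$ is an edge, and hence
monochromatic, so that the assertion holds trivially. In general, since the coloring $\chi$ is $\varepsilon$-good,
there are two vertices $a, b$ on the path $P(u,v)$ such that the segment $P(a,b)$ is monochromatic and
$d_T(a,b) \ge \varepsilon d_T(u,v)$. Without loss of generality we assume that $d_T(a,u) < d_T(b,u)$. If $\varepsilon < \alpha$
then there is nothing to prove, so assume that $\varepsilon \ge \alpha$. Denoting $A = d_T(u,a)$, $B = d_T(b,v)$ and
$D = d_T(u,v)$, we apply the inductive hypothesis to the paths joining $u$ and $a$ and $b$ and $v$, to get
that

$$\begin{aligned}
\sum_{k \in \mathbb{Z}} \ell_k^{\chi}(u,v) \cdot \mathbf{1}_{\{\ell_k^{\chi}(u,v) \ge \alpha D\}}
&\ge d_T(a,b) + \left( 1 - \left( \frac{\alpha D}{\varepsilon A} \right)^{\varepsilon/2} \right) A + \left( 1 - \left( \frac{\alpha D}{\varepsilon B} \right)^{\varepsilon/2} \right) B \\
&= D - \left( \frac{\alpha D}{\varepsilon} \right)^{\varepsilon/2} \left( A^{1-\varepsilon/2} + B^{1-\varepsilon/2} \right) \\
&\ge D - \left( \frac{\alpha D}{\varepsilon} \right)^{\varepsilon/2} 2^{\varepsilon/2} (A+B)^{1-\varepsilon/2} && (6) \\
&\ge D - \left( \frac{\alpha D}{\varepsilon} \right)^{\varepsilon/2} 2^{\varepsilon/2} (1-\varepsilon)^{1-\varepsilon/2} D^{1-\varepsilon/2} && (7) \\
&\ge \left( 1 - \left( \frac{\alpha}{\varepsilon} \right)^{\varepsilon/2} \right) D, && (8)
\end{aligned}$$

where in (6) we used the concavity of the function $t \mapsto t^{1-\varepsilon/2}$, in (7) we used the fact that
$D = A + B + d_T(a,b) \ge A + B + \varepsilon D$, and in (8) we used the elementary inequality $2^{\varepsilon/2}(1-\varepsilon)^{1-\varepsilon/2} \le 1$,
which is valid for every $\varepsilon \in [0, 1]$. □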

In [28], Matoušek proves that if $\chi$ is a monotone edge-coloring of $T$ such that every root-leaf
path contains at most $M$ distinct color classes, then $c_p(T) \le O\big( (\log M)^{\min\{1/p, 1/2\}} \big)$. Clearly any
such coloring is $1/(2M)$-strong. The next theorem generalizes Matoušek's result along these lines.

We suggest that the reader skip this rather technical proof upon a first reading. In particular,
the much simpler Lemma 2.3 suffices for the proof of Theorem 1.1, although it does not give the
optimal quantitative bounds of Theorem 1.2.

Theorem 2.6. For every weighted tree $T = (V, E)$ and $p \ge 1$,



Cp{T) < 4 



log 



6*iT) 



Proof. We may assume that $p \in [2, \infty)$, since if $p \in [1, 2)$ the required result follows by embedding
$T$ into $\ell_2$, which embeds isometrically into $L_p$. Fix $\delta < \min\{\delta^*(T), 1/2\}$ and let $\chi : E \to \mathbb{Z}$ be a
$\delta$-strong coloring. Let $\{e_k\}_{k \in \mathbb{Z}}$ be as in Lemma 2.3. For $v \in V$ we denote by $(k_1(v), \ldots, k_{m_v}(v))$ the
sequence of color classes encountered on the path from the root to $v$. We shall denote by $d_j(v)$ the
distance that the color class $k_j(v)$ contributes to the path from the root to $v$, i.e.

$$d_j(v) = \sum_{\substack{e \in P(v) \\ \chi(e) = k_j(v)}} \ell(e).$$



Now let 



= ^max< Q,dj{v) - - ^dh{v) \ , 

j=i K h=i ) 



and define f : V ^ ^p(^) by 



i=l 
We will break the proof of the fact that $f$ satisfies the required distortion bound into
several claims.



Claim 2.7. For all v € V and j S {1, . . . , m„}, 

^ rrii, 

Si{v) >j^dj 



Proof. This is where the fact that $\chi$ is a $\delta$-strong coloring comes in. Indeed,



my ( ^ j 1 



h=i 



2 - 4 

jG{i,...,m„} J=« 



I nil, 



□ 



Claim 2.8. $\|f\|_{\mathrm{Lip}} \le [5 \log(3/\delta)]^{1/p}$.



Proof. We need to show that for every edge {u,v) G E, \\f{u) - f{v)\\p < W[log{l / 6)]^/p . Assume 
that V is further than u from the root of T. In this case ki(u) = ki(v), . . . , kmy[u) = km^^iv) and 
rriv G {ruu, m„ + 1}. If ruy = ruu + 1 then we denote for the sake of simplicity (u) = Sm^ (u) = 0. 
With this notation we have that 



1=1 
rriy — l 



+ 



i=l 



\[dmM\'''[^mAu)]^'~'^'' - [dmAvtHsmMt-'^'" 



Note that by our definitions, Sm^iu) = dm^iu) and Sm^iv) = dm^iv). Thus 

rriy — l 



\\f{u)-f{v)\\l = Y.^" 



i=l 
m„-l 



1)/P 



+ \dray{u) - dmAv)\^ 



< 



Y d^{v) [s,{u)](P-'yP-[s,{v)]^P-^yp\[dT{u,v)]P. 



i=l 



Observe that for all i G {1, . . . ,mi, — 1}, Si{v) > Si{u). Thus 



< 



\si{u) - Si{v)\ 



(9) 



(10) 



where we used the elementary inequality $y^{\alpha} - x^{\alpha} \le \frac{y - x}{y^{1-\alpha}}$, which is valid for all $y \ge x > 0$ and
$\alpha \in (0,1)$.

Observe that for every i < m„ — 1, 

Si{v) - Si{u) = max <^ 0,dm„('y) - i^Y^''^'"^ ( ~ 1 0:^^m„(^i) - r - '^t{u,v). (11) 



h=i 



h=i 
Thus, combining (10) and (11) we see that



m.,j — 1 



i)/p 



i=l 



rriv — l 



< 



\si{u) - Si{v)\P 



1=1 



Si{v) 



< [dT{u,vW- E 



iG{l,...,m„-l} 



di{v) 

Si{v) 



< 4[dTiu,v)r- E 



di{v) 



ie{l,...,m^ — l} 



(12) 



where in the last line we used Claim 2.7.
Observe that for every xi, . . . , > 0, 



^ + Xi+i H h Xfc + 

Thus 



^ Jxi+iH haifc 



E 



..,1 ^.-o^?-"'*"' 



t + 1 



E 



t + 1 



log(xi H h a^fc + 1) 



di{v)/dm, (f) 



Si(M)7^Si(w) 



< loe 



1 



V 



ie{l,...,m„ — 1} 

Si(u)^Si('u) 



(13) 



Let i be the smallest integer in {1, . . . ,my — 1} such that Si{u) ^ Si{v). Then by the definition of 



dm A'") > i;^dhii 



h=i 



It follows that 



log 



1 + 



V 



dni^ (v) 



;e{l,.--,»n-i, — 1} . 



< log 



< log 1 + 



Plugging this bound into (13), and using (12) and (9), we get that



||/(n)-/(t;)||,< 



41og( 1 + - 1 +1 



i/p 



dT{u,v) < [5log{3/ 5)]^^P ■dT{u,v). 



□ 



Our final claim bounds $\|f^{-1}\|_{\mathrm{Lip}}$.

Figure 2: A schematic description of the location of $u$ and $v$ in the tree $T$. The bold segment
corresponds to the color class $k_j(u) = k_j(v)$. (The figure carries the label $d_1(u) = d_1(v)$.)

Claim 2.9. The embedding $f$ is invertible, and $\|f^{-1}\|_{\mathrm{Lip}} \le 48$.

Proof. Fix $u, v \in V$, $u \ne v$, and let $j$ be the integer satisfying $k_1(u) = k_1(v), \ldots, k_j(u) = k_j(v)$
and $k_{j+1}(u) \ne k_{j+1}(v)$. It follows that $d_1(u) = d_1(v), \ldots, d_{j-1}(u) = d_{j-1}(v)$, and we may assume
without loss of generality that $d_j(u) \ge d_j(v)$. With this notation (see Figure 2),

$$d_T(u,v) = d_j(u) - d_j(v) + \sum_{i=j+1}^{m_u} d_i(u) + \sum_{i=j+1}^{m_v} d_i(v).$$

On the other hand, 

||/(n) - f{vW^ > \[dj{u)]yP[s,{u)]''P-'yP - 

niu rriu 



(14) 



+ 



(15) 



i=j+i 



i=j+i 



Using Claim 2.7 we see that

TTlu -t TTT'U I < I I' LL 



i=j+l 



4P- 



i=j+l \h=i / 



> 



4p- 



4P- 



i=j + l J + i W-l 'rd.rnu {«) 

fdj + l{jl)H \-dmu (") 

t^-^dt 



1 / \ 



(16) 
Similarly,



i=j+l ^ \i=j+l J 



(17) 



We now consider two cases: 

2 — l^i=j+l ' 



Case 1. '^J '^J < "^"^-.^ di{v). In this case, using (fn|) we see that 



V=i+i 2=j+i 

< p4:P-^.3P-2P-'\\f{u)-f{vWp, 
where in the last inequality we plugged the bounds (16) and (17) into (15). Thus we get that



as required. 

[u 

2 ^ Z^i=j+1 



||/(u)-/(t;)||p>^-dT(n,^;), 



Case 2. iMzM^ > Y.^r,^di{v). In this case we observe that 



(■(-u) = ^max j 0,di{u) - |- > (^1 - -J 



i=j h=j 

and similarly 

(ij(M) — dj{v) 



Thus 



/ x\ (p-l)/p 



> (1 " ' 



y djju) - dj{v) 
where we used the fact that $\delta < \frac{1}{2}$. Using (15) and the bounds (16) and (17), it follows that



i=j+l i=j+l 




p4p . 3P 

□ 

Claim 2.9 together with Claim 2.8 concludes the proof of Theorem 2.6. □

2.2 Relating coloring bounds to the containment of large binary trees 

The following theorem, in conjunction with Theorem 2.6 and Lemma 2.5, implies Theorem 2.1. If
one is concerned with simply giving some upper bound on $c_p(T)$ in terms of $\mathscr{B}_T(c)$, then it suffices
to combine the following theorem with Lemma 2.3.

Theorem 2.10. For every weighted tree T = {V, E) and every c > 1, 

c- 1 1 



250c e*{T)' 



Proof. We start by introducing some notation. For a vertex $v \in V$ we denote by $\mathscr{C}(v)$ the set of all
children of $v$ in $T$. Given $u \in \mathscr{C}(v)$ we denote by $T_u = (V_u, E_u)$ the subtree rooted at $u$. We also
let $F_u$ denote the tree $F_u = (V_u \cup \{v\}, E_u \cup \{(v, u)\})$, i.e. $F_u$ is $T_u$ plus the "incoming" edge $(v, u)$.

Recall that $B_k = (V_k, E_k)$ is the complete binary tree of height $k$. Let $r_k$ be the root of $B_k$, and
define an auxiliary tree $M_k$ by $M_k = (V_k \cup \{m_k\}, E_k \cup \{(m_k, r_k)\})$ (i.e. $M_k$ is $B_k$ with an extra
incoming edge). Given a connected subtree $H$ of $T$ rooted at $r_H$, we shall say that $H$ admits a
copy of $M_k$ at scale $j$ if there exists a one-to-one mapping $f : M_k \to H$ such that

1. $f(m_k) = r_H$.

2. ll/llLip < ^ • 4^' and ||/-i||Lip < (thus in particular dist(/) < c). 
We define 

$$\mu_j(H) = \max\{k \in \mathbb{N} : H \text{ admits a copy of } M_k \text{ at scale } j\},$$

or $\mu_j(H) = -1$ if no such $k$ exists.

We shall now define a function $g : V \to \mathbb{Z} \cup \{\infty\}$ and a coloring $\chi : E \to \mathbb{Z}$. These mappings
will be constructed by induction as follows. We start by setting $g(r) = \infty$. Assume inductively
that the construction is done so that whenever $v \in V$ is such that $g(v)$ is defined, if $u$ is a vertex
on the path $P(v)$ then $g(u)$ has already been defined, and for every edge $e \in E$ incident with $v$,
$\chi(e)$ has been defined.

Let $v \in V$ be a vertex closest to the root $r$ for which $g(v)$ hasn't yet been defined. Then, by our
assumption, for every $e \in P(v)$, $\chi(e)$ has been defined, and for every vertex $u$ other than $v$ lying
on the path $P(v)$, $g(u)$ has been defined. Let $\beta_{\chi}(v) \subseteq V$ denote the set of breakpoints of $\chi$ in $P(v)$,
i.e. the set of vertices on $P(v)$ for which the incoming and outgoing edges have distinct colors (for
convenience, in what follows we shall also consider the root $r$ as a breakpoint of $\chi$). We define

$$g(v) = \max\left\{ j \in \mathbb{Z} : \forall u \in \beta_{\chi}(v),\ d_T(u,v) \ge 4^{\min\{g(u), j\}} \right\}.$$

Having defined $g(v)$ we choose one of its children $w \in \mathscr{C}(v)$ for which

$$\mu_{g(v)}(F_w) = \max_{z \in \mathscr{C}(v)} \mu_{g(v)}(F_z).$$

Letting $u$ be the father of $v$ on the path $P(v)$, we set $\chi(v, w) = \chi(u, v)$, and we assign arbitrary
new (i.e. which haven't been used before) distinct colors to each of the edges $\{(v, z)\}_{z \in \mathscr{C}(v) \setminus \{w\}}$. In
other words, given the "scale" $j = g(v)$ we order the children of $v$ according to the size of the copy
of $M_k$ which they admit beneath them at scale $j$. We then continue coloring with the color $\chi(u, v)$
the path $P(v)$ along the edge joining $v$ and its child which admits the largest $M_k$ at scale $j$, and
color the remaining edges incident with $v$ by arbitrary distinct new colors.

This definition clearly results in a monotone coloring $\chi$. To motivate this somewhat complicated
construction, we shall now prove some of the crucial properties of $\chi$ and $g$ which will be used later.

Claim 2.11. Let P he any monotone path in T, and let (bi, 62, • • • , bm) be the sequence of breakpoints 
along P ordered down the tree (i.e. in increasing distance from the root). Assume that j £ Z satisfies 
for every i £ {2, . . . , m}, dxibi, ftj-i) < 4-', and assume also that dx^bi, bm) > • 4-' ■ Then there 
exists a subsequence of the indices 1 < ii < 12 < • • • < ik 1^ ^ such that 

1. k> • dT{bi,bm). 

2. For every $s \in \{1, \ldots, k\}$ we have $g(b_{i_s}) = j$.

3. For every s G {1, . . . , A; - 1} we have ^ • 4^ < drih^^^Ms) < -§=1' ■ 

Proof. We shall show that if i G {1, . . . ,m} is such that dT{bi,bm) > ^3— then there exists an 
index t G {1, . . . , m} with g{bt) = j and dxibt, bi) < Assuming this fact for the moment, we 

conclude the proof as follows. Let ii be the smallest integer in {2, . . . ,m} such that g{bi^) = j. 
Then dT{bi^,bi) < 4-'"'"^. Assuming we defined ii < ^2 < • • • < is, if dribi^^bm) < ■ 4-' we 
stop the construction, and otherwise we let t be the smallest integer bigger than ig such that 
dribt, bi,) > ^ • Since dribt-iMs) < ^ • 4^ we know that dribu h,) < ^ • 4^' + AK Thus 
drih, bm) > drih, , bm) - ■ 4-' - 4^ > (because we are assuming that drih, , bm) < ^ • 4-'). 
So, there exists ig+i G {l,...,m} such that g{bi,_^_^) = j and dT{bi,^^,bt) < 4-'+^. Since by 
construction drih, bij > • 4^ > 4^~^^ we deduce that ig+i > is and 

9 9c ■ 

-4^ < dxihtMs) - dT{bi,+,M) < dT{h,^,As) < dribi^^.M) + dxibtMs) < 7 •4^- 

c— 1 c — 1 

This construction terminates after k steps, in which case we have that 
dT{bi,bm) = dribu h,) + Y^ drih^A.^,) + dribi^^bm) < 4^+^ + {k - I) ■ ■ 4^ + ^ • 4^. 

s=l 
Since dxibi, bm) > ^zj ' this implies the required result.

It remains to show that if i G {!,..., m} is such that dT{bi,bm) > ^3— then there exists 
t G {1, . . . , m} with g{bt) = j and dxibt-, bi) < 4^~^^. We first claim that for every i € {1, . . . , m} 
there is a breakpoint w € Pxi^i) ^^^^ g{w) > J and dxiw^bi) < ^3— • Indeed, if g{bi) > j then 
there is nothing to prove, so assume that g{bi) < j. By the definition of g there exists a breakpoint 
wi G Pxi^i) such that 

Thus necessarily g{wi) > g{bi) + 1 and dxiwi, bi) < 45(''»)+^ < 4-^. If (7(64) + 1 > j then we are done 
by taking w = wi. Otherwise, continuing in this manner we find a breakpoint ^2 € P^i'^^i) ^ /^x(^*) 
with g{w2) > g{wi) + 1 > (7(64) + 2 and dT{w2, wi) < 4^^^'^^'^^. This procedure terminates when we 
find a sequence 6j = wq, wi,W2, ■ ■ ■ ,wt with (7(tL't) > j, g{wt-i) < j, and for every < s < t — 1, 
> g{ws) + 1 and dT{ws+i,Ws) < A^^^^^^+K Thus 

t-l t-l j j+i 

dTibi,wt) = Y,dT{ws+uWs)<Y,'^'^""^^' < Yl ^' = ^- 

s=0 s=0 s=— 00 

Now, assume that dT{bi,bm) > ^3~- Let s be the largest integer in {i + 1, . . . ,m} such that 
dj'{bs-,bi) < (such an integer s exists since dT{bi,biJ^i) < 4^). Then < dT{bs+i,bi) < 
+ 4-'. By the previous argument there is a break point w G P^i^s+i) with 5'(t(;) > j and 
dx^w, bg+i) < ^3—- This implies that w = bt for some t G • • • , and dribi, bt) < ^^+4-'. 

We proved that as long as bi satisfies dr^bi, bm) > ^3—) there are l<t<i<s<m such that 
gibs) > j, g{bt) > i, and dxih, bi) < ^3—, dribs, bi) < ^ — h 4^ . Thus, by the definition of g, 

^min{g(bs),g(bt)} < dribs, bt) < ^^^^ + 4^' < 4^+\ 

3 

It follows that either $g(b_s) = j$ or $g(b_t) = j$, as required. □

To conclude the proof of Theorem 12.101 we may assume that e* (T) < , since otherwise 
the assertion of Theorem 12.101 is trivial. Fix > e > e* (T) . By the definition of e* (T) , the 
coloring $\chi$ constructed above is not $\varepsilon$-good. Thus, there exist two vertices $u, v \in V$ such that
the path $P(u, v)$ does not contain a monochromatic segment of length at least $\varepsilon d_T(u, v)$. We
may assume without loss of generality that $u$ is an ancestor of $v$, and let $(b_1, \ldots, b_m)$ be the
sequence of breakpoints along this path, enumerated down the tree (i.e. from $u$ to $v$, not necessarily
including $u$ or $v$). Denoting $D = d_T(u, v)$ we have that $d_T(u, b_1), d_T(v, b_m), d_T(b_i, b_{i+1}) < \varepsilon D$
for all i G {l,...,m — 1}. Fix j £ TL such that eD < 4-^ < 4eD. This choice implies that 
dTih,h^x) < 4j and drib^bm) > (1 - 2e)D > ^ ■ 4^ > ^ ■ 4K By Claim EH] there is an 

integer k > 20c.4J^'^'*^ — §EUc ' ^ (using the upper bound on e) and a sequence of breakpoints 
si, . . . ,Sk on the path P(n, v) (ordered down the tree) such that ^(si) = • • • = gis^) = j and for 

ie{i,...,k-i}, ^.4^ <dTisi,s,+i) <^-4K 

The proof of Theorem 1 2 . 1 1 will be complete once we show that ^xic) > k — 2. For i G {1, . . . , A;} 
let ti be the child of Si along the path P(u, v). We will prove by reverse induction on i G {1, . . . , k—1} 







Figure 3: A schematic description of the gluing procedure in the inductive step. Because $s_i$ was a
breakpoint it must have two copies of $M_{k-i-1}$ at scale $j$ below it.

that $\mu_j(F_{t_i}) \ge k - i - 1$, implying the required result. The base case is true, i.e. $\mu_j(F_{t_{k-1}}) \ge 0$,
since the pair $(s_{k-1}, s_k)$ constitutes a copy of $M_0$ at scale $j$.

Assuming that $\mu_j(F_{t_i}) \ge k - i - 1$ we shall prove that $\mu_j(F_{t_{i-1}}) \ge k - i$. Since $s_i$ was a
breakpoint, the construction of $\chi$ implies that there must be a child $t_i'$ of $s_i$, other than $t_i$, for which
$\mu_j(F_{t_i'}) \ge \mu_j(F_{t_i}) \ge k - i - 1$. Thus, there exist one to one mappings $f, f' : M_{k-i-1} \to T$ such
that f{mk-^-l) = f'imk^i.i) = Si, /(Mfc„,„i) C F*,, /'(Mfc„,_i) C F^,, ||/||Lip, ||/'||Lip < ^ • 4^ 
and ||/~^||Lip5 II (/')~^ Ikip < Thinking of M^-i as two disjoint copies of Mk-i-i, joined at the 
root ruk-i, we may "glue" / and /' to an embedding / of Mk-i by setting f{mk-i) = Si-i. Since 

• 4^ < dxisi, Si-i) < ■ 4P , this results in an embedding at scale j of Mk_i into Ft-_j^, as 
required (see Figure 3). □

2.3 Embedding into finite-dimensional spaces 

We recall that the doubling constant $\lambda(X)$ of a metric space $X$ is the infimal value of $\lambda$ for which
every ball in $X$ can be covered by $\lambda$ balls of half the radius. If $S \subseteq X$ is a $\delta$-separated set in $X$, then
a standard observation is that 15*1 < X(^X)'-'^^'^'^^^^/^\ This section is devoted to a simpler proof of 
the following theorem of Gupta, Krauthgamer, and Lee originally proved in [17]. (We stress that
the only results we need for this section are Lemma 2.3 and Theorem 2.10.)

Theorem 2.12 ([17]). A tree metric $T$ embeds into a finite-dimensional Euclidean space if and
only if $\lambda(T) < \infty$. In other words, every doubling tree $T$ admits a $D$-embedding into $\mathbb{R}^k$ with $D, k$
depending only on $\lambda(T)$.

Let $T = (V, E)$ be a weighted, rooted tree. Note that the "only if" part of Theorem 2.12 is
trivial. In order to prove the remaining implication we need a coloring notion weaker than $\varepsilon$-good.
Let $\chi : E \to \mathbb{Z}$ be a coloring of the edges of $T$ which is not necessarily monotone. We will say
that $\chi$ is $\varepsilon$-reasonable if the following holds for every $u, v \in V$. Let $w = \mathrm{lca}(u,v)$, and recall that
$P(w,u)$, $P(w,v)$ denote the paths from $w$ to $u$ and $v$, respectively. Then there should exist a color
$c \in \mathbb{Z}$ for which

$$\left| \sum_{e \in P(w,u) :\, \chi(e) = c} \ell(e) \;-\; \sum_{e \in P(w,v) :\, \chi(e) = c} \ell(e) \right| \ge \varepsilon\, d_T(u,v). \qquad (18)$$

Since a reasonable coloring is not necessarily monotone, it is possible to construct such colorings 
where x~^iE) is finite even though T might be infinite. The number of colors used, i.e. 
controls the dimension of the embedding from Lemma [ 



Lemma 2.13. Let $T = (V,E)$ be a weighted tree, and suppose that $T$ admits an $\varepsilon$-reasonable
coloring $\chi$ for some $\varepsilon > 0$. Then $T$ embeds in $\mathbb{R}^k$ (equipped, e.g. with the $L_2$ norm) with distortion
$O(1/\varepsilon)$, and $k = |\chi^{-1}(E)|$.

Proof. Let $\chi : E \to \mathbb{Z}$ be an $\varepsilon$-reasonable coloring of $T$. We use the embedding $f : V \to \ell_2$ of
Lemma 2.3. In particular, it is easy to check that the definition of the embedding does not require
$\chi$ to be monotone. Observe that $\mathrm{Im}(f)$ lies naturally in $\mathrm{span}\{e_k : k \in \chi^{-1}(E)\}$, and thus we may
assume that $f : V \to \mathbb{R}^k$ with $k = |\chi^{-1}(E)|$.

From the proof of Lemma 2.3 we conclude that $\|f\|_{\mathrm{Lip}} \le 1$, and thus we need only consider
$\|f^{-1}\|_{\mathrm{Lip}}$. But it is easy to see that condition (18) suffices to obtain a similar lower bound in
equation (jH) of Lemma 12.31 □ 

We note that the dependence of k on |x~^(-E')| in the above lemma can be improved to fc = 
0(log using a "nearly-orthogonal" set of vectors instead of the orthonormal set {efcjfcgz- 

We refer to [17] for details. 

Now, clearly ^t(2) < 0(logA(T)) since X{Bm) = 2®(™), hence e*(r) > l/0(logA(r)) using 
Theorem 2.10. In light of Lemma 2.13 and the preceding remark, the following result completes
the proof of Theorem 2.12. (Note that we can assume $T$ finite by compactness: a tree embeds
into a finite-dimensional Euclidean space if and only if every finite subset embeds with uniformly
bounded distortion.)

Theorem 2.14. Let $T = (V,E)$ be a finite, weighted tree. If $T$ admits an $\varepsilon$-good coloring, then it
also admits an $O(\varepsilon)$-reasonable coloring with only $\lambda(T)^{O(\log(1/\varepsilon))}$ colors.

Proof. We will say that a monotone coloring $\chi : E \to \mathbb{Z}$ is regular if the following holds: For every
maximal monochromatic segment $s = \{e_1, e_2, \ldots, e_k\} \subseteq E$ (with edges ordered down the tree), and
for every $1 \le i < k$, we have $\ell(e_{i+1}) \le 2 \sum_{j=1}^{i} \ell(e_j)$.

Lemma 2.15. If a finite tree T admits an e-good coloring, then T admits an O{e)-good regular 
coloring. 

Proof. Let $T = (V, E)$ be a rooted tree, and let $\chi_0 : E \to \mathbb{Z}$ be an $\varepsilon$-good coloring of $T$. Suppose
that some monochromatic segment $s = \{e_1, e_2, \ldots, e_k\} \subseteq E$ violates the regularity condition. Let
$i \in [k]$ be the smallest index for which $\ell(e_{i+1}) > 2 \sum_{j=1}^{i} \ell(e_j)$. We derive a new coloring $\chi_1 : E \to \mathbb{Z}$
by coloring the edges $e_1, \ldots, e_i$ with a new unused color $c \in \mathbb{Z}$, i.e. $\chi_1(e) = c$ if $e = e_j$ for some
$1 \le j \le i$ and $\chi_1(e) = \chi_0(e)$ otherwise. Continue this process inductively until the resulting
coloring $\chi' : E \to \mathbb{Z}$ is regular. This process terminates because $T$ is finite. It remains to show that
$\chi'$ is $O(\varepsilon)$-good.
To this end, let $s = \{e_1, \ldots, e_k\} \subseteq E$ be a maximal monochromatic segment according to $\chi_0$,
and let $s_1, s_2, \ldots, s_m \subseteq s$ be the maximal monochromatic segments of $s$ according to $\chi'$, ordered
down the tree. By construction, we have

$$\ell(s_m) \ge 2\ell(s_{m-1}) \ge \ell(s_{m-1}) + 2\ell(s_{m-2}) \ge \cdots \ge \ell(s_1) + \cdots + \ell(s_{m-1}),$$

hence $\ell(s_m) \ge \frac{1}{2}\ell(s)$. It follows that $\chi'$ is a regular $\varepsilon/2$-good coloring of $T$. □
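The recoloring step in the proof above can be illustrated on a single monochromatic segment. The following is a minimal sketch, assuming the segment is given only by its list of edge lengths ordered down the tree; the function name `split_into_regular_pieces` is hypothetical and not part of the paper. It cuts the segment each time the next edge is longer than twice the total length of the current piece, which is one way to realize the inductive recoloring described in the proof.

```python
from typing import List


def split_into_regular_pieces(lengths: List[float]) -> List[List[float]]:
    """Split one monochromatic segment into consecutive regular pieces.

    `lengths` lists the edge lengths of the segment, ordered down the tree.
    Inside each returned piece, every edge after the first is at most twice
    the total length of the edges preceding it in that piece.
    """
    pieces: List[List[float]] = []
    current: List[float] = []
    running = 0.0  # total length of the current piece so far
    for length in lengths:
        if current and length > 2 * running:
            # Regularity would fail here: close the current piece
            # (in the proof, this prefix receives a new unused color).
            pieces.append(current)
            current, running = [], 0.0
        current.append(length)
        running += length
    if current:
        pieces.append(current)
    return pieces


example = [1, 1, 5, 2, 30, 10]
example_pieces = split_into_regular_pieces(example)
```

In this example the bottom-most piece carries more than half of the total length of the segment, in line with the bound $\ell(s_m) \ge \frac{1}{2}\ell(s)$ from the proof.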

Let $T$ be a rooted tree, and let $\chi : E \to \mathbb{Z}$ be an $\varepsilon$-good coloring of $T$. Using the preceding
lemma, we may assume that $\chi$ is regular. Let $C$ be the set of color classes. We will think of segments
$s \in C$ sometimes as a subset of edges and sometimes as a subset of vertices (the endpoints and internal
vertices of the segments), depending on the context. In everything that follows, we will assume
that for $s \ne s' \in C$, we have $\mathrm{diam}(s) \ne \mathrm{diam}(s')$. This is without loss of generality by applying
arbitrarily small perturbations to $T$. (Alternatively, we could fix a total order on segments of equal
diameter, but this would add unnecessary notation to the proof.)

For every segment $s \in C$, we define $p(s)$ as the vertex of $s$ which is closest to the root. For every
$s_0 \in C$ and $K > 0$ we define a relative length function

$$\mathrm{length}_{s_0}(s; K) = \max\left\{ \mathrm{diam}\big(P(p(s), x)\big) : x \in s \cap B_T\big(p(s_0), K \cdot \mathrm{diam}(s_0)\big) \right\}, \qquad (19)$$

where we take $s \in C$, and we set $\mathrm{length}_{s_0}(s; K) = 0$ in case the maximum is empty. In words, this
is how long the segment $s \in C$ "looks" from $p(s_0)$, where the "view" is restricted to a ball of radius
$K \cdot \mathrm{diam}(s_0)$. It is important to note that even when $s \not\subseteq B_T(p(s_0), K \cdot \mathrm{diam}(s_0))$, one might have
$0 < \mathrm{length}_{s_0}(s; K) \ll K \cdot \mathrm{diam}(s_0)$ since $T$ is not necessarily an $\mathbb{R}$-tree.

Now we define carefully a directed graph $\vec{G}_C = (C, E_C)$. The adjacency relationship on $\vec{G}_C$ will
be the key in producing an $O(\varepsilon)$-reasonable coloring. We put

$$(s, s') \in E_C \iff \mathrm{length}_s(s'; K) \ge \mathrm{diam}(s), \qquad (20)$$

for some constant $K \ge 6$ to be chosen later. Observe, in particular, that $(s, s') \in E_C \implies
\mathrm{diam}(s') \ge \mathrm{diam}(s)$. We will argue that the undirected graph $G_C$ which results from ignoring
the edge directions in $\vec{G}_C$ has its chromatic number bounded solely by a function of $\lambda(T)$. We
accomplish this with the following sequence of lemmas. (This step is non-trivial since $G_C$ does not
have bounded degree.)

Lemma 2.16. For every $s \in C$, the out-degree is bounded, i.e.

$$|\{s' \in C : (s,s') \in E_C\}| \le \lambda(T)^{O(\log K)}.$$

Proof. Fix $s \in C$. For every $s' \in C$ with $(s, s') \in E_C$, let $x_{s'} \in s'$ be the node achieving the maximum
in (19). If the maximum does not exist then $\mathrm{length}_s(s'; K) = 0$, hence $(s,s') \notin E_C$. By definition,
$d_T(p(s), x_{s'}) \le K \cdot \mathrm{diam}(s)$. Furthermore, $d_T(p(s'), x_{s'}) = \mathrm{length}_s(s'; K) \ge \mathrm{diam}(s)$. It follows
that the set $X_s = \{x_{s'} : (s,s') \in E_C\}$ is $\mathrm{diam}(s)$-separated. Since $X_s \subseteq B_T(p(s), K \cdot \mathrm{diam}(s))$, the
doubling property implies that

$$|\{s' \in C : (s,s') \in E_C\}| = |X_s| \le \lambda(T)^{O(\log K)}.$$

□
For any undirected graph $G = (V_G, E_G)$ and $v \in V_G$, we define $N(v)$ to be the set of neighbors of
$v$ in $G$, we let $\deg(v) = |N(v)|$ and $\deg_S(v) = |N(v) \cap S|$ for $S \subseteq V_G$. The next result is well-known.

Lemma 2.17. Let $G = (V_G, E_G)$ be any finite, undirected graph. Let $k \in \mathbb{N}$ and let $\pi : V_G \to
\{1, 2, \ldots, n\}$ be a permutation. We denote $\pi_j = \{v \in V_G : \pi(v) \le j\}$. If, for every $j = 1, 2, \ldots, n$,
we have

$$\deg_{\pi_{j-1}}(\pi^{-1}(j)) \le k,$$
then the chromatic number of $G$ is at most $k + 1$.

Proof. The proof follows by inductively coloring the elements $\pi^{-1}(1), \pi^{-1}(2), \ldots, \pi^{-1}(n)$ in order.
If we have a palette of $k + 1$ colors, then since $\deg_{\pi_{j-1}}(\pi^{-1}(j)) \le k$, we can always choose a
color for $\pi^{-1}(j)$ that doesn't conflict with any already colored vertex in $\pi_{j-1}$. □
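The inductive coloring in this proof is the usual greedy procedure. Below is a minimal sketch, assuming the graph is given as an adjacency dictionary and the permutation as a list of vertices in the order $\pi^{-1}(1), \pi^{-1}(2), \ldots$; the function name `greedy_coloring` and the 5-cycle example are illustrative choices, not taken from the paper.

```python
from typing import Dict, Hashable, List, Set


def greedy_coloring(order: List[Hashable],
                    adj: Dict[Hashable, Set[Hashable]]) -> Dict[Hashable, int]:
    """Color vertices in the given order with the smallest available color.

    If every vertex has at most k neighbors among the vertices that come
    before it in `order`, the colors used lie in {0, ..., k}.
    """
    color: Dict[Hashable, int] = {}
    for v in order:
        # Colors already taken by neighbors that were processed earlier.
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# A 5-cycle: each vertex has at most 2 earlier neighbors, so 3 colors suffice.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
cycle_colors = greedy_coloring(list(range(5)), cycle)
```

The point of the lemma is that only the back-degree along the ordering matters, which is why Corollary 2.18 can order the segments by diameter even though the undirected graph has unbounded degree.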

Corollary 2.18. If $G_C$ is the undirected version of $\vec{G}_C$, then the chromatic number of $G_C$ is bounded
by $\lambda(T)^{O(\log K)}$.

Proof. Let $\pi : C \to \{1, 2, \ldots, |C|\}$ be any permutation for which $\mathrm{diam}(\pi^{-1}(j)) \ge \mathrm{diam}(\pi^{-1}(j + 1))$ for
$1 \le j \le |C| - 1$ (i.e. the diameters of the segments decrease monotonically). Then combining
Lemma 2.16 and the fact that $(s,s') \in E_C \implies \mathrm{diam}(s') \ge \mathrm{diam}(s)$ shows that for $j = 1, 2, \ldots, |C|$,

$$\deg_{\pi_{j-1}}(\pi^{-1}(j)) \le \lambda(T)^{O(\log K)}.$$

Applying Lemma 2.17 completes the proof. □

Now let $\chi_C : C \to [k]$ be a proper coloring of $G_C$ using only $k = \lambda(T)^{O(\log K)}$ colors. We are done
as soon as we show that $\chi_C$ is an $O(\varepsilon)$-reasonable edge-coloring of $T$ (where we consider $\chi_C$ as a
coloring of $E$ in the obvious way) for some choice of $K \le (1/\varepsilon)^{O(1)}$.

Lemma 2.19. Suppose that for $s \ne s' \in C$, we have

$$\mathrm{diam}(s \cap P(u, v)),\ \mathrm{diam}(s' \cap P(u, v)) \ge \frac{d_T(u,v)}{K/2 - 1},$$

where $u, v \in T$. Then $\chi_C(s) \ne \chi_C(s')$.

Proof. Assume that $\mathrm{diam}(s') \ge \mathrm{diam}(s)$, and let $x$ be the bottom-most point of $s' \cap P(u,v)$. Then

$$d_T(p(s),x) \le \mathrm{diam}(s) + d_T(u, v) \le (1 + (K/2 - 1))\, \mathrm{diam}(s) \le \frac{K}{2} \cdot \mathrm{diam}(s). \qquad (21)$$

In this case, $\mathrm{length}_s(s'; K) \ge \mathrm{diam}(P(p(s'),x))$, hence if $\mathrm{diam}(P(p(s'), x)) \ge \mathrm{diam}(s)$, we have
$(s, s') \in E_C$, which finishes the proof of the lemma.

So we may assume that $\mathrm{diam}(P(p(s'), x)) < \mathrm{diam}(s)$. We claim that in this case, $\mathrm{length}_s(s'; K) \ge
\mathrm{diam}(s)$ using the regularity of $\chi$. Let $y \in s'$ be such that $d_T(x,y) \le \frac{K}{2}\,\mathrm{diam}(s)$, and for which
$d_T(p(s'),y)$ is maximal. If $d_T(p(s'),y) \ge \mathrm{diam}(s)$, then we are done since by (21), we have
$d_T(p(s),y) \le K \cdot \mathrm{diam}(s)$, implying $\mathrm{length}_s(s'; K) \ge d_T(p(s'),y) \ge \mathrm{diam}(s)$. Hence we may assume
that $d_T(p(s'),y) < \mathrm{diam}(s)$. In this case, since $\mathrm{diam}(s') \ge \mathrm{diam}(s)$, there exists an edge $(y,y')$ with
$y' \in s'$ and $d_T(p(s'),y') \ge \frac{K}{2} \cdot \mathrm{diam}(s) \ge 3 \cdot \mathrm{diam}(s)$. But this implies that $\ell(y, y') > 2 \cdot d_T(p(s'),y)$,
which contradicts the regularity of $\chi$.

It follows that $\mathrm{length}_s(s'; K) \ge \mathrm{diam}(s)$, which again implies $(s,s') \in E_C$. □
Now fix $u, v \in V$ and $w = \mathrm{lca}(u,v)$, and suppose that $d_T(w,u) \ge d_T(w,v)$. Since $\chi$ is an
$\varepsilon$-good coloring, there exists a maximal monochromatic segment (with respect to $\chi$) $s \subseteq E$ for
which diam(s n li)) > edT{w,u) > (e/2)dr(n, f). Now set K = 4(1)^"''^''^. Applying Lemma 

12.191 we see that for any s' (1 E with xc{s) = Xc{s'), we have diam(s' n P{w,v)) < (f)^^^'''^- But 
now hne ([5]) of Lemma 12.51 implies that segments of this length can cover at most an e/2-fraction 
of P{w,v) (since x is e-good coloring), which is at most an e/4-fraction of P{u,v). It follows 
that xc is a (5-reasonable coloring for 5 = | — | > |, completing the proof. 

□ 



3 Markov convexity and distortion lower bounds 

In this section we study Markov convexity, and show how it can be used to prove several distortion 
lower bounds. In particular, we will discuss the connection between Markov convexity and uniform 
convexity in Banach spaces, and we will prove that Theorem 2.1 is optimal.



3.1 Markov convexity in Banach spaces 

We start by showing that Hilbert space is Markov 2-convex. This has essentially been proved by
Bourgain in [6]. We give the following proof here because the argument is extendable to the case
of $p \neq 2$. We refer also to [25] for another variant of Bourgain's proof.

Lemma 3.1. For every $x_0, \ldots, x_{2^m} \in L_2$,

$$\sum_{i=1}^{2^m} \|x_i - x_{i-1}\|_2^2 = \frac{\|x_{2^m} - x_0\|_2^2}{2^m} + \sum_{k=1}^{m} \frac{1}{2^k} \sum_{j=1}^{2^{m-k}} \left\|x_{j 2^k} - 2x_{(2j-1)2^{k-1}} + x_{(j-1)2^k}\right\|_2^2. \qquad (22)$$



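Proof. Let $\mathcal{F}_k$ be the $\sigma$-algebra of subsets of $[0, 1]$ generated by the intervals $\left\{ I_j^k := \left[\frac{j-1}{2^k}, \frac{j}{2^k}\right) \right\}_{j=1}^{2^k}$.

Define $\varphi : [0, 1] \to L_2$ by $\varphi = x_j - x_{j-1}$ on $I_j^m$. Set $\varphi_k = \mathbb{E}(\varphi \mid \mathcal{F}_k)$, where the expectation is with
respect to the Lebesgue measure on $[0, 1]$. In other words, for every $j \in \{1, \ldots, 2^k\}$ and $t \in I_j^k$,

$$\varphi_k(t) = \frac{1}{2^{m-k}} \sum_{\ell = 2^{m-k}(j-1)+1}^{j 2^{m-k}} (x_\ell - x_{\ell-1}) = \frac{x_{j 2^{m-k}} - x_{(j-1) 2^{m-k}}}{2^{m-k}}.$$

Since the sequence $\{\varphi_k - \varphi_{k-1}\}_{k=1}^{m}$ is a martingale difference sequence, and $\varphi_0$ is constant, the
functions $\varphi_0, \varphi_1 - \varphi_0, \varphi_2 - \varphi_1, \ldots, \varphi_m - \varphi_{m-1}$ are orthogonal (in the Hilbert space $L_2(L_2)$). Thus

$$\mathbb{E}\|\varphi_m\|_2^2 = \mathbb{E}\|\varphi_0\|_2^2 + \sum_{k=1}^{m} \mathbb{E}\|\varphi_k - \varphi_{k-1}\|_2^2.$$

This is precisely the required identity. □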
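As a sanity check, the identity of Lemma 3.1 can be tested numerically on points of $\mathbb{R}^d$. The sketch below assumes the identity is read as $\sum_{i=1}^{2^m}\|x_i-x_{i-1}\|^2 = 2^{-m}\|x_{2^m}-x_0\|^2 + \sum_{k=1}^{m} 2^{-k}\sum_{j=1}^{2^{m-k}}\|x_{j2^k}-2x_{(2j-1)2^{k-1}}+x_{(j-1)2^k}\|^2$; the helper names are hypothetical, and a finite-dimensional check is of course not a proof.

```python
import random
from typing import List, Sequence, Tuple


def sq_norm(v: Sequence[float]) -> float:
    """Squared Euclidean norm."""
    return sum(c * c for c in v)


def identity_sides(x: List[List[float]], m: int) -> Tuple[float, float]:
    """Return (left side, right side) of the dyadic identity for x_0..x_{2^m}."""
    n = 2 ** m
    assert len(x) == n + 1
    lhs = sum(sq_norm([a - b for a, b in zip(x[i], x[i - 1])])
              for i in range(1, n + 1))
    rhs = sq_norm([a - b for a, b in zip(x[n], x[0])]) / n
    for k in range(1, m + 1):
        step = 2 ** k
        half = step // 2
        # Second differences at scale 2^k, aligned to the dyadic grid.
        inner = sum(
            sq_norm([a - 2 * b + c for a, b, c in
                     zip(x[j * step], x[(2 * j - 1) * half], x[(j - 1) * step])])
            for j in range(1, n // step + 1))
        rhs += inner / step
    return lhs, rhs


rng = random.Random(0)
points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2 ** 3 + 1)]
random_lhs, random_rhs = identity_sides(points, 3)
small_lhs, small_rhs = identity_sides([[0.0], [1.0], [4.0]], 1)
```

For the three points $0, 1, 4$ on the line, both sides equal $10$: the left side is $1 + 9$, and the right side is $16/2 + \frac{1}{2}(4 - 2)^2$.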
We can remove the dyadic bias from (22) by averaging over shifts.
Corollary 3.2. For every $x_0, \ldots, x_{2^m} \in L_2$,

2™- II ||2 m 2™ 

Vllr- T ,||2 > 11^2"' -X0II2 ^Y^o-2fcY^|| ^ 

1=1 k=l t=l 

where, by convention, $x_j = x_0$ for $j < 0$.
Proof. First, consider the sequence of length $3 \cdot 2^m - 2$,

$$x_0, x_0, \ldots, x_0,\ x_0, x_1, \ldots, x_{2^m},\ x_{2^m}, x_{2^m}, \ldots, x_{2^m},$$

which is the original sequence with $2^m - 1$ copies of $x_0$ and $x_{2^m}$ appended to the front and back,
respectively. Call this sequence $\{y_j\}$. Now average the equality (22) over all $2^{m+1}$ contiguous
subsequences of length $2^m + 1$, i.e. $\{y_i, y_{i+1}, \ldots, y_{i+2^m}\}$ for $i = 1, \ldots, 2^{m+1}$. By counting terms,
this yields the desired result. □

Theorem 3.3. Hilbert space is Markov 2-convex. In fact, $\Pi_2(L_2) \le 4$.

Proof. Let $\{X_t\}_{t=0}^{\infty}$ be a Markov chain on a state space $\Omega$, and take $f : \Omega \to L_2$. By Corollary 3.2,

2™ ^ m 1 



Y,nf{Xt)-f{x,.,)\\i>\Y,2~''Y.^\\f{Xt)-2f{x,_^.^^^ 

t=i 

lE||/(Xo)-/(X2™)||i 



2 

t=i fc=i t=i 



+ 



22m 



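where by convention we set $X_t = X_0$ for $t < 0$.

Observe that for every two i.i.d. random vectors $Z, Z' \in L_2$, and every constant $a \in L_2$,
$\mathbb{E}\|Z - Z'\|_2^2 \le 2\,\mathbb{E}\|Z - a\|_2^2$. Thus, using the fact that conditioned on $X = \{X_0, \ldots, X_{t-2^{k-1}}\}$ the
random vectors $f(X_t)$ and $f(\tilde{X}_t(t - 2^{k-1}))$ are i.i.d., we see that

$$\mathbb{E}\left\|f(X_t) - f(\tilde{X}_t(t-2^{k-1}))\right\|_2^2 = \mathbb{E}\left[\mathbb{E}\left[\left\|f(X_t) - f(\tilde{X}_t(t-2^{k-1}))\right\|_2^2 \,\Big|\, X\right]\right] \le 2\,\mathbb{E}\left\|f(X_t) - 2f(X_{t-2^{k-1}}) + f(X_{t-2^k})\right\|_2^2.$$

Likewise, $\mathbb{E}\|f(X_0) - f(\tilde{X}_{2^m}(0))\|_2^2 \le 2\,\mathbb{E}\|f(X_0) - f(X_{2^m})\|_2^2$. It follows that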



m+l 2" 



^ -1 "(.-pj. ^ 

^E||/(X,)-/(X,_i)||i > -^2-2'=^E||/(X,)-/(X,(t-2^-i))||i, 

fc=l i=l 
^ m 2™ ^ 

^ 2-2^ ^ E||/(X,) - f{Xt{t - 2^))\\l 



4 

t=l k=l t=l 

m 2" 

16 



fc=0 i=l 

completing the proof. □ 

Remark 3.1. The above argument can be generalized to prove that $p$-convex Banach spaces are
Markov $p$-convex. Recall that a Banach space $X$ is said to be $p$-convex with constant $K$ (see [3])
if for every $x, y \in X$,

$$2\|x\|^p + \frac{2}{K^p}\|y\|^p \le \|x+y\|^p + \|x-y\|^p.$$

The least such constant $K$ is denoted $K_p(X)$.
We claim that for every Banach space X, 



Iip{X) < 2(P"i)/P (2P^i - 1) • Kp{X) < 4Kp{X). 

Indeed, repeating the argument of Lemma 3.1, we replace the use of orthogonality by Pisier's
inequality [33] to get that (see the argument in [2] for the constant used below),

m 

fc=l 

As in the proof of Theorem 3.3 (and using the notation there) this shows that

2™ 

- i)[Kp{xr]Y,m{Xt) - fixt^im > 
t=i 

k=l t=l 

Since for every two i.i.d. random vectors Z,Z' G X, and every constant a G X, we have that 
K\\Z — Z'\Wr < 2^~"'^E||Z — o||^ (this fact follows from a straightforward interpolation argument), 
we conclude exactly as in the proof of Theorem 3.3.

We mention some partial converses to Remark 3.1.

Corollary 3.4. Let $X$ be an infinite dimensional Banach space. Then $\Pi_p(X) < \infty$ implies that $X$
is superreflexive and has cotype $q$ for every $q > p$.

Proof. Let $q_X = \inf\{q : X \text{ has cotype } q\}$. By the Maurey-Pisier theorem [29], $X$ contains copies
of $\ell_{q_X}^n$ with distortion uniformly bounded in $n$. By Bourgain's embedding of trees into $\ell_{q_X}$ [6],
this implies that $c_X(B_m) = O\big((\log m)^{1/q_X}\big)$. From Bourgain's lower bound [6], or alternatively
Claim 3.7 below, we deduce that $q_X \le p$, as required. The fact that $X$ is superreflexive follows
from Bourgain's characterization of superreflexivity [6]. □

Corollary 3.5. Let $X$ be a Banach lattice with $\Pi_p(X) < \infty$. Then for every $q > p$, $X$ admits a
$q$-convex equivalent norm.

Proof. This is a direct consequence of a theorem of Figiel [13] (see [22], page 100) which says
that a Banach lattice with cotype $q$ and non-trivial type can be renormed to be $q$-convex ($X$ has
non-trivial type since it is superreflexive). □

3.2 Distortion lower bounds 

We can now use the discrepancy between $\Pi_p(X)$ and $\Pi_p(Y)$ to prove distortion lower bounds for
embeddings between the two spaces.

Lemma 3.6. Let $(X, d_X)$, $(Y, d_Y)$ be metric spaces, then for every $p < \infty$, we have

$$c_Y(X) \ge \frac{\Pi_p(X)}{\Pi_p(Y)}.$$
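Proof. Fix $\Pi > \Pi_p(Y)$. Let $g : X \to Y$ be a bi-Lipschitz map, let $\{X_t\}_{t=0}^{\infty}$ be a Markov chain with
state space $\Omega$, and let $f : \Omega \to X$. Then

$$\sum_{k=0}^{m} \sum_{t=1}^{2^m} \frac{\mathbb{E}\left[d_X\big(f(X_t), f(\tilde{X}_t(t-2^k))\big)^p\right]}{2^{kp}}
\le \|g^{-1}\|_{\mathrm{Lip}}^p \cdot \sum_{k=0}^{m} \sum_{t=1}^{2^m} \frac{\mathbb{E}\left[d_Y\big(g(f(X_t)), g(f(\tilde{X}_t(t-2^k)))\big)^p\right]}{2^{kp}}$$

$$\le \|g^{-1}\|_{\mathrm{Lip}}^p \cdot \Pi^p \cdot \sum_{t=1}^{2^m} \mathbb{E}\left[d_Y\big(g(f(X_t)), g(f(X_{t-1}))\big)^p\right]
\le \|g\|_{\mathrm{Lip}}^p \cdot \|g^{-1}\|_{\mathrm{Lip}}^p \cdot \Pi^p \cdot \sum_{t=1}^{2^m} \mathbb{E}\left[d_X\big(f(X_t), f(X_{t-1})\big)^p\right].$$

It follows that $\Pi_p(X) \le c_Y(X) \cdot \Pi_p(Y)$, as required. □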

As a warm up to the more involved lower bounds that will follow, we show how Markov convexity 
can be used to prove Bourgain's theorem for complete binary trees. 

Claim 3.7. For every $m \in \mathbb{N}$, we have $\Pi_p(B_{2^m}) \ge 2^{1 - \frac{2}{p}} \cdot m^{\frac{1}{p}}$.

Proof. Let $\{X_t\}_{t=0}^{\infty}$ be the forward random walk on $B_{2^m}$ (which goes left/right with probability $\frac{1}{2}$),
starting from the root, with the leaves as absorbing states. Then

$$\sum_{t=1}^{2^m} \mathbb{E}\left[d_{B_{2^m}}(X_t, X_{t-1})^p\right] \le 2^m.$$

Moreover, in the forward random walk, after splitting at time $r \le 2^m$ with probability at least $\frac{1}{2}$
two independent walks will accumulate distance which is at least twice the number of steps (until
a leaf is encountered). Thus

$$\sum_{k=0}^{m} \sum_{t=1}^{2^m} \frac{\mathbb{E}\left[d_{B_{2^m}}\big(X_t, \tilde{X}_t(t - 2^k)\big)^p\right]}{2^{kp}} \ge \sum_{k=0}^{m-1} \sum_{t=1}^{2^m - 2^k} \frac{1}{2^{kp}} \cdot \frac{1}{2} \cdot 2^{(k+1)p} \ge 2^{p-2} \cdot m \cdot 2^m.$$

The claim follows. □ 
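The divergence estimate used in this proof can be made concrete. The sketch below computes, exactly, the expected $p$-th power of the distance between two independent forward walks that start at the same vertex of a binary tree and each take $n$ steps, under the simplifying assumption that neither walk reaches a leaf during those $n$ steps. The function name `expected_distance_power` is hypothetical. The two walks share exactly $i < n$ steps with probability $2^{-(i+1)}$, and then end at distance $2(n - i)$.

```python
def expected_distance_power(n: int, p: float) -> float:
    """E[d(Y_n, Y'_n)^p] for two independent forward walks from a common
    vertex, assuming no leaf is reached within n steps."""
    total = 0.0
    for i in range(n):
        # The walks agree on exactly i steps with probability 2^-(i+1),
        # after which they are at tree distance 2 * (n - i).
        total += 2.0 ** -(i + 1) * (2 * (n - i)) ** p
    return total


one_step = expected_distance_power(1, 1)
four_steps_squared = expected_distance_power(4, 2)
```

The first term of the sum alone is $\frac{1}{2}(2n)^p$, which is the lower bound invoked in the proof.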

Since $L_p$ is Markov $\max\{2,p\}$-convex for every $p > 1$, combining Claim 3.7 with Lemma 3.6
recovers Bourgain's result [6], i.e. for every $p > 1$, we have $c_p(B_k) \ge \Omega\big((\log k)^{\min\{1/2,\,1/p\}}\big)$.

Simple random walks with positive speed. In fact, the proof of Claim 3.7 applies in more
general situations where a random walk has positive speed. We consider some examples.

Let $G = (V, E)$ be an infinite, vertex-transitive graph of bounded degree. Let $\{X_t\}_{t=0}^{\infty}$ be a
simple random walk on $G$ starting from an arbitrary vertex. Denote by $d_G$ the shortest path metric
on $G$. One defines the speed of the random walk as the limit

$$\lim_{t \to \infty} \frac{\mathbb{E}\, d_G(X_0, X_t)}{t}.$$

Subadditivity implies that the limit above always exists.
Lemma 3.8. If the speed of the simple random walk on a vertex-transitive graph $G$ is at least $s > 0$,
then $\Pi_p(B_G(R)) = \Omega_s\big((\log R)^{1/p}\big)$, where $B_G(R)$ denotes the ball of radius $R$ in $G$. In particular
for every $p > 1$,

$$c_p(B_G(R)) = \Omega\big((\log R)^{\min\{1/2,\,1/p\}}\big).$$

Proof. The proof is similar to that of Claim 3.7. One simply observes that for two independent
simple random walks $X_t, \tilde{X}_t$ started at the same point, we have

$$\lim_{t \to \infty} \frac{\mathbb{E}\, d_G(X_t, \tilde{X}_t)}{t} = \lim_{t \to \infty} \frac{\mathbb{E}\, d_G(X_0, X_{2t})}{t} \ge 2s > 0.$$

In particular, for $t$ large enough, with constant probability we have $d_G(X_t, \tilde{X}_t) = \Omega_s(t)$. □

As an application, consider the lamplighter group over $\mathbb{Z}^d$. This is a group with elements $(f, x)$,
where $x \in \mathbb{Z}^d$, and $f : \mathbb{Z}^d \to \{0, 1\}$ with $f(y) = 0$ for all but finitely many $y \in \mathbb{Z}^d$. Traditionally,
one imagines a lamp placed at every element of $\mathbb{Z}^d$, where each lamp can either be on or off. In the
pair $(f, x)$, $f$ denotes the settings of all the lamps, and $x$ denotes the position of the lamplighter.
Accordingly, the generating set consists of two types of moves.

1. The lamplighter can move to an adjacent vertex in $\mathbb{Z}^d$, i.e. $(f, x) \mapsto (f, x')$ where $x'$ is adjacent
to $x$ in the standard Cayley graph of $\mathbb{Z}^d$, or

2. The lamplighter can turn on/off the lamp at $x$, i.e. $(f, x) \mapsto (f', x)$ where $f'(y) = f(y)$ for
$y \ne x$ and $f'(x) = 1 - f(x)$.
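The two types of moves can be written down directly. The following is a minimal sketch, assuming a lamp configuration $f$ with finitely many lit lamps is stored as the frozen set of lit positions; the type aliases and the function name `neighbors` are illustrative choices, not taken from the paper.

```python
from typing import FrozenSet, List, Tuple

Position = Tuple[int, ...]
State = Tuple[FrozenSet[Position], Position]  # (lit lamps, lamplighter position)


def neighbors(state: State) -> List[State]:
    """All states reachable by one generator move in the lamplighter graph."""
    lamps, x = state
    result: List[State] = []
    # Type 1: move the lamplighter to an adjacent vertex of Z^d.
    for axis in range(len(x)):
        for delta in (-1, 1):
            moved = list(x)
            moved[axis] += delta
            result.append((lamps, tuple(moved)))
    # Type 2: toggle the lamp at the current position.
    result.append((lamps ^ frozenset([x]), x))
    return result


start: State = (frozenset(), (0, 0))
moves_from_start = neighbors(start)
```

In dimension $d$ every state has $2d + 1$ neighbors under these generators, and the toggle move is an involution.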

We will use $L(\mathbb{Z}^d)$ to denote the associated group as well as the Cayley graph with the described
generators. A result of Kaimanovich and Vershik [19] shows that the simple random walk on $L(\mathbb{Z}^d)$
has positive speed for $d > 2$. Using Remark 3.1 and Lemma 3.8 we conclude:

Corollary 3.9. For $d > 2$, the word metric on $L(\mathbb{Z}^d)$ does not embed into any $p$-convex Banach
space. In particular, if $B_{L(\mathbb{Z}^d)}(R)$ denotes a ball of radius $R$, then for $p > 1$,

$$c_p\big(B_{L(\mathbb{Z}^d)}(R)\big) \ge \Omega\big((\log R)^{\min\{1/2,\,1/p\}}\big).$$

We remark that, by a theorem of Varopoulos [36], the simple random walk on the Cayley graph
of a finitely-generated group has positive speed if and only if there exists a bounded, non-constant, 
harmonic function on the graph. 

Finally, we consider the finite lamplighter groups over $\mathbb{Z}_N = \mathbb{Z}/(N\mathbb{Z})$, which we denote by
$L(\mathbb{Z}_N)$. In this case, the simple random walk on $L(\mathbb{Z}_N)$ does not have positive speed, but it is still
possible to prove a distortion lower bound because the Markov chains in the definition of Markov
convexity need not be reversible. In particular, consider the chain $\{X_t\}_{t=0}^{\infty}$ defined as follows.
$X_0 = (f, 0)$ where $f \equiv 0$, i.e. all lamps are turned off. If $X_t = (f, i)$, then with probability $\frac{1}{2}$, we
put $X_{t+1} = (f, i+1)$, and with probability $\frac{1}{2}$ we put $X_{t+1} = (f', i+1)$ where $f'(i+1) = 1 - f(i+1)$.
Arguing essentially exactly as in Claim 3.7 for times $t \le N$, we have the following.

Proposition 3.10. For every $p < \infty$, we have $\Pi_p(L(\mathbb{Z}_N)) \ge \Omega\big((\log N)^{1/p}\big)$. In particular,

$$c_p(L(\mathbb{Z}_N)) \ge \Omega\big((\log N)^{\min\{1/2,\,1/p\}}\big).$$

Proposition 3.10 can also be proved by exhibiting an embedding of a complete binary tree of
depth $\Theta(N)$ into $L(\mathbb{Z}_N)$; see [26].
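The non-reversible chain on $L(\mathbb{Z}_N)$ used before Proposition 3.10 is easy to simulate. The sketch below assumes that $f'$ agrees with $f$ away from position $i+1$ and that positions are taken modulo $N$; the names `chain_step` and `run_chain` are hypothetical.

```python
import random
from typing import FrozenSet, Tuple

State = Tuple[FrozenSet[int], int]  # (lit lamps, lamplighter position in Z_N)


def chain_step(state: State, n: int, rng: random.Random) -> State:
    """One step: always advance to i + 1 (mod n); with probability 1/2
    also flip the lamp at the new position."""
    lamps, i = state
    j = (i + 1) % n
    if rng.random() < 0.5:
        return (lamps, j)
    return (lamps ^ frozenset([j]), j)


def run_chain(n: int, steps: int, seed: int = 0) -> State:
    """Run the chain from the all-off configuration at position 0."""
    rng = random.Random(seed)
    state: State = (frozenset(), 0)
    for _ in range(steps):
        state = chain_step(state, n, rng)
    return state


final_lamps, final_position = run_chain(n=8, steps=5)
```

For $t \le N$ steps the lamplighter visits each position at most once, so the lamp configuration records one independent fair bit per step, which is what makes the chain behave like the forward walk on a binary tree.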
3.3 Weak prototypes and Markov convexity

In this section we study the Markov convexity properties of a special class of trees called weak
prototypes. These trees will play a central role in the section where Theorem 1.4 is proved. We
begin with some definitions (we continue using here the notation of Section 2.1).

In what follows, by a path metric $P = (u_1, \ldots, u_m)$ we simply mean a graph theoretical path
from $u_1$ to $u_m$ with edges $(u_1,u_2), (u_2,u_3), \ldots, (u_{m-1},u_m)$ and edge weights $\{\ell(u_j, u_{j+1})\}_{j=1}^{m-1} \subseteq
[0,\infty)$. The length of $P$, denoted $\ell(P)$, is given by $\ell(P) = \sum_{j=1}^{m-1} \ell(u_j, u_{j+1})$. Given a monotone
path $P$ in $T$, and a set of vertices $(v_1, \ldots, v_m)$ on $P$, ordered down the tree and not necessarily
containing all the vertices of $T$ lying on $P$, we will call the path metric on $(v_1, \ldots, v_m)$ with the
edge weights $\{d_T(v_j, v_{j+1})\}_{j=1}^{m-1}$, the path metric induced by $T$ on $(v_1, \ldots, v_m)$.

Given a path metric $P = (u_1, \ldots, u_m)$ and $\varepsilon, \delta \in (0, 1)$ we shall say the path $P$ is $(\varepsilon, \delta)$-weak if
at least an $\varepsilon$-fraction of the length of $P$ is composed of edges of length at most $\delta\ell(P)$, i.e.

$$\sum_{\substack{j \in \{1, \ldots, m-1\} \\ \ell(u_j, u_{j+1}) \le \delta \ell(P)}} \ell(u_j, u_{j+1}) \ge \varepsilon \sum_{j=1}^{m-1} \ell(u_j, u_{j+1}).$$
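The $(\varepsilon, \delta)$-weak condition is a direct computation on the edge weights of a path metric. A minimal sketch follows, assuming the path is represented only by its list of edge lengths; the function name `is_weak` is hypothetical.

```python
from typing import Sequence


def is_weak(edge_lengths: Sequence[float], eps: float, delta: float) -> bool:
    """Check whether edges of length at most delta * l(P) make up at least
    an eps-fraction of the total length l(P) of the path."""
    total = sum(edge_lengths)
    short = sum(length for length in edge_lengths if length <= delta * total)
    return short >= eps * total


# Four unit edges followed by one long edge: total length 10.
path = [1, 1, 1, 1, 6]
weak_quarter = is_weak(path, eps=0.25, delta=0.25)
weak_half = is_weak(path, eps=0.5, delta=0.25)
```

With $\delta = 1/4$ only the unit edges count as short, and they contribute $4$ out of a total length of $10$, so the path is $(1/4, 1/4)$-weak but not $(1/2, 1/4)$-weak.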

A monotone path $P(u, v)$ in $T$ will be called degree-2 $(\varepsilon, \delta)$-weak if the following condition holds
true. Let $(u_1, \ldots, u_m)$ be the vertices of $T$ on $P$, ordered down the tree, which have at least two
children in $T$. Then we require that the path metric induced by $T$ on $(u, u_1, \ldots, u_m, v)$ is $(\varepsilon, \delta)$-weak.
In other words, call a monotone path $P$ in $T$ a strait if every vertex on $P$ has exactly one child,
except possibly for the initial and final vertices. Then $P(u, v)$ is degree-2 $(\varepsilon, \delta)$-weak if at least an
$\varepsilon$-fraction of the length of $P(u,v)$ is composed of maximal straits of length at most $\delta d_T(u,v)$.

Definition 3.11. Fix $\varepsilon, \delta, R > 0$. A tree $T = (V, E)$ with edge lengths $\ell : E \to (0, \infty)$ is called an
$(\varepsilon, \delta)$-weak prototype with height ratio $R$ if the following conditions are satisfied.

• Every non-leaf vertex of $T$ has exactly one or two children.

• Every root-leaf path of $T$ is degree-2 $(\varepsilon, \delta)$-weak.

• If $h$ is the length of the shortest root-leaf path in $T$ and $h'$ is the length of the longest root-leaf
path in $T$, then $h'/h \le R$.

3.3.1 Markov convexity for unweighted weak prototypes 

First, we will prove a lower bound on the Markov convexity constants of a special class of unweighted 
weak prototypes. Later, we will show that every weak prototype can be approximated by a weak 
prototype satisfying these conditions. 

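Theorem 3.12. Let $(T, d_T)$ be an unweighted $(\varepsilon, \delta)$-weak prototype with height ratio 1 and height
$2^m$ for some $m \in \mathbb{N}$. Then for every $p \ge 1$, we have

$$\Pi_p(T) \ge \left(\frac{\varepsilon}{4}\left[\log_2(\varepsilon/\delta) - 4\right]\right)^{1/p}.$$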
Proof. Let $r$ be the root of $T$. Let $\{X_t\}_{t=0}^{\infty}$ be the Markov chain on $T$ defined as follows. Initially,
$X_0 = r$. If $X_t$ is a leaf node, then $X_{t+1} = X_t$, and otherwise $X_{t+1}$ is a uniformly random child of
$X_t$.

First, we have $d_T(X_{t-1}, X_t) \le 1$ for every $t \ge 1$. Thus it suffices to show that



1 ^^^[dT[Xt,Xt{t-2' 



fc=0 t=l 



e 

> - 
- 4 



log2 ( f 



Recall that a monotone path $P$ in $T$ is a strait if every node of $P$ has exactly one child, except
possibly for the initial and final nodes. Additionally, say that a node $v \in T$ is a branch point if $v$
has at least two children. Clearly the edges of every root-leaf path partition into maximal straits
with branch points at the ends (except for the root and leaves). Let $B_k(t)$ be the event that the
set $\{X_t, X_{t+1}, \ldots, X_{t+2^{k-1}}\}$ contains a branch point. Observe that whenever $2^{k-1} \ge \delta 2^m$, we have



Pr[;Bfc(t)] > Y2 Pi' Xt falls in a maximal strait of length at most 2 



t=o 



> e2'' 



(23) 



since every root-leaf path of $T$ is degree-2 $(\varepsilon, \delta)$-weak. Furthermore, if $k \le m$, and $t \le 2^m - 2^k$,
then



occurs =^ Pr (i7i(X^_|_2fc , X^_^2'= (0) — ^ 



1 

^2' 



(24) 



since upon hitting a branch point, the two chains will diverge with probability at least $\frac{1}{2}$ for at
least $2^{k-1}$ additional steps.

We conclude that when $2^{k-1} \ge \delta 2^m$ and $k \le m$,



2m _2''- 



t=l 



> Yl ^[d[Xt+2^,Xt+2>^{t) 
t=0 

> ^•2'=^-Pr[Sfc(t)] 



> 2 



kp—l 



e2'' 



(25) 
(26) 



where in (25) we used (24), and in (26) we used (23) along with a correction term for boundary
values of $k$. Therefore,



1 ^r^^[dT[Xt,Xt{t-2^ 



fc=0 t=l 



log 



> 



Y max<!0,e2"^ -2« 

fc>l+log2((52™-) 

41. 



The proof of Theorem 3.12 is complete.



□ 
3.3.2 Distortion bounds for weak prototypes

In this section, we show how to pass from a finite tree $T$ to a more well-behaved tree $\tilde{T}$ such that
$c_p(\tilde{T}) = O(1) \cdot c_p(T)$ for every $p \in [1,\infty)$. We use this transformation to prove distortion lower
bounds for arbitrary weak prototypes.

Lemma 3.13. Let $(T, d_T)$ be a finite, graph-theoretic metric tree, and let $T_{\mathbb{R}}$ be the $\mathbb{R}$-tree that
results from replacing every edge $e \in E(T)$ by a closed interval whose length is $\mathrm{length}(e)$. Then
for every $p \in [1,\infty)$, we have $c_p(T_{\mathbb{R}}) \le 5c_p(T)$.

Proof. Fix a root $r$ of $T$ (and, in particular, an orientation of the edges). Let $f : T \to L_p$ be an
embedding of $T$. Let $\{\beta_{uv}\}_{uv \in E(T)} \subseteq L_p$ be a system of disjointly supported unit vectors each of
whose support is also disjoint from the support of $\mathrm{Im}(f)$. Denote a point $x \in T_{\mathbb{R}}$ by $x = (u,v,\eta)$,
where $uv \in E(T)$, we have $d_T(r,u) \le d_{T_{\mathbb{R}}}(r,x) \le d_T(r,v)$, and $d_{T_{\mathbb{R}}}(x,u) = \eta \cdot d_T(u,v)$ for $\eta \in [0, 1]$.
Assume that $\|f\|_{\mathrm{Lip}} = 1$. We define an embedding $g : T_{\mathbb{R}} \to L_p$ by

$$g(u,v,\eta) = (1 - \eta) f(u) + \eta f(v) + \eta\, d_T(u,v)\, \beta_{uv}.$$

Fix $(u, v, \eta), (u', v', \eta') \in T_{\mathbb{R}}$. If $u'$ is not a descendant of $u$ or vice-versa then

$$d_{T_{\mathbb{R}}}\big((u,v,\eta), (u',v',\eta')\big) = \eta\, d_T(u,v) + \eta'\, d_T(u',v') + d_T(u,u').$$

Thus

\\g{u,v,r]) - g{u',v',r]')\\p 

< \\9{u,v,r]) - g{u,v,0)\\p + \\g{u,v,0) - g{u' ,v' ,0)\\p + \\g{u',v',0) - g{u' ,v' ,r]')\\p 
= Uf{v) - /(n)) + vdT{u,v)(3uv\\p + 11(1 - r?)(/(n) - f{u'))\l 

+ h'ifiv) - f{u')) + r,'dT{u',v')[iu'A\p 
= V iWfin) - fivWp + dT{u,vrf' + (1 - 7?)||/(n) - f{n')\\p 

+,/{\\fiu')-fiv')rp+dT{u',vrY^' 

< 2^/Pr]dTiu, v) + 2^/Pr]'dT{u', v') + (1 - ri)dT{u, u') 

< 2^IP.dT^{{u,v,r,),{u',v',rj')). 

If $u'$ is a strict descendant of $u$ then

\[ d_{T_{\mathbb{R}}}\big((u,v,\eta),(u',v',\eta')\big) = (1-\eta)\, d_T(u,v) + \eta'\, d_T(u',v') + d_T(v,u'), \tag{27} \]

and a similar reasoning shows that $\|g(u,v,\eta) - g(u',v',\eta')\|_p \le 2^{1/p} \cdot d_{T_{\mathbb{R}}}((u,v,\eta),(u',v',\eta'))$. The
case of $u = u'$ is even simpler, so we have shown that $\|g\|_{\mathrm{Lip}} \le 2^{1/p}$.

On the other hand, we will now bound $\|g^{-1}\|_{\mathrm{Lip}}$ from above. Assume first of all that $u'$ is not a
descendant of $u$ or vice-versa. Then



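\[
\begin{aligned}
\|g(u,v,\eta) - g(u',v',\eta')\|_p^p
&\ge [\eta\, d_T(u,v)]^p + [\eta'\, d_T(u',v')]^p
 + \Big[ \|f(u) - f(u')\|_p - \eta \|f(u) - f(v)\|_p - \eta' \|f(u') - f(v')\|_p \Big]^p \\
&\ge [\eta\, d_T(u,v)]^p + [\eta'\, d_T(u',v')]^p
 + \Big[ \frac{d_T(u,u')}{\|f^{-1}\|_{\mathrm{Lip}}} - \eta\, d_T(u,v) - \eta'\, d_T(u',v') \Big]^p \\
&\ge \frac{5}{2} \Big[ \frac{1}{5} \Big( \frac{d_T(u,u')}{\|f^{-1}\|_{\mathrm{Lip}}} + \eta\, d_T(u,v) + \eta'\, d_T(u',v') \Big) \Big]^p \\
&\ge \frac{5}{2} \Big[ \frac{1}{5} \cdot \frac{d_{T_{\mathbb{R}}}\big((u,v,\eta),(u',v',\eta')\big)}{\|f^{-1}\|_{\mathrm{Lip}}} \Big]^p,
\end{aligned}
\tag{28}
\]

where in (28) we used the convexity of the function $a \mapsto |a|^p$, which implies that for all $a,b,c \in \mathbb{R}$
we have $|a|^p + |b|^p + |c|^p \ge \frac{5}{2} \big| \frac{2}{5} a + \frac{2}{5} b + \frac{1}{5} c \big|^p$. The last step uses $\|f^{-1}\|_{\mathrm{Lip}} \ge 1$, which holds
because $\|f\|_{\mathrm{Lip}} = 1$.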



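If $u'$ is a strict descendant of $u$ then $d_{T_{\mathbb{R}}}((u,v,\eta),(u',v',\eta'))$ is given by (27). Denote this
distance by $D$, and for the sake of simplicity write $L = \|f^{-1}\|_{\mathrm{Lip}}$. Since

\[ \|g(u,v,\eta) - g(u',v',\eta')\|_p^p \ge [\eta\, d_T(u,v)]^p + [\eta'\, d_T(u',v')]^p \ge 2^{1-p} \big[ \eta\, d_T(u,v) + \eta'\, d_T(u',v') \big]^p, \]

we may assume that $\eta\, d_T(u,v) + \eta'\, d_T(u',v') \le \frac{2D}{5L}$. In this case

\[
\begin{aligned}
\|g(u,v,\eta) - g(u',v',\eta')\|_p
&\ge \|(1-\eta) f(u) + \eta f(v) - (1-\eta') f(u') - \eta' f(v')\|_p \\
&\ge \|f(u) - f(u')\|_p - \eta \|f(u) - f(v)\|_p - \eta' \|f(u') - f(v')\|_p \\
&\ge \frac{d_T(u,u')}{L} - \eta\, d_T(u,v) - \eta'\, d_T(u',v') \\
&\ge \frac{D - \eta'\, d_T(u',v')}{L} - \eta\, d_T(u,v) - \eta'\, d_T(u',v') \\
&\ge \frac{D}{L} - \frac{2D}{5L} - \frac{2D}{5L} \\
&= \frac{D}{5L}.
\end{aligned}
\]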

The remaining case is when $u = u'$ and $v = v'$. But then $\|g(u,v,\eta) - g(u',v',\eta')\|_p \ge |\eta - \eta'| \cdot
\|f(u) - f(v)\|_p$, and the required lower bound is trivial.

We have thus proved that $\|g\|_{\mathrm{Lip}} \cdot \|g^{-1}\|_{\mathrm{Lip}} \le 5 \|f\|_{\mathrm{Lip}} \cdot \|f^{-1}\|_{\mathrm{Lip}}$, as required. □



Remark 3.2. The above lemma does not hold if we allow "Steiner" nodes in the tree $T$. To
observe this, consider the subset $L \subseteq B_m$ of leaves of a complete binary tree of height $m$, and let
$r$ be the root of $B_m$. Then it is not difficult to see that $c_2(L \cup \{r\}) \le O(1)$ (independent of $m$),
while $c_2(B_m) \to \infty$ by Bourgain's theorem for $B_m$ [6].

We now replace any weak prototype by an "equivalent" prototype with height ratio 1. 

Lemma 3.14. Let $(T, d_T)$ be any finite metric tree. Then there exists a finite, unweighted metric
tree $(\tilde T, d_{\tilde T})$ with height $2^m$ for some $m \in \mathbb{N}$ such that $c_p(\tilde T) \le O(1) \cdot c_p(T)$ for any $p \in [1,\infty)$.
Furthermore, if $T$ is an $(\varepsilon,\delta)$-weak prototype with height ratio $R$, then $\tilde T$ is an unweighted
$(\varepsilon/(2R), \delta)$-weak prototype with height ratio 1.

Proof. Fix a root $r$ of $T$. Since $T$ is finite, by rescaling and paying an arbitrarily small
distortion, we may assume that all edge lengths are integral. Let $m =
\lceil \log_2 \max_{x \in T} d_T(x,r) \rceil$. We now define a tree $T'$ as follows. For every leaf $\ell \in T$, define a new node
$\tilde\ell$, and create a new edge $(\ell, \tilde\ell)$ of length $2^m - d_T(r,\ell)$. Thus the length of every root-leaf path in
$T'$ is exactly $2^m$. To see that $c_p(T') = O(1) \cdot c_p(T)$, let $f : T \to L_p$ be an embedding of $T$, and let
$\{\beta_\ell\} \subseteq L_p$ be a system of disjointly supported unit vectors each of whose support is disjoint from $\mathrm{Im}(f)$.
One can extend the embedding by defining $f(\tilde\ell) = f(\ell) + d_{T'}(\ell,\tilde\ell) \cdot \beta_\ell$ so that $c_p(T') \le O(1) \cdot c_p(T)$.
Observe that if $T$ had height ratio $R$, then the length of any root-leaf path in $T'$ has increased by
at most a factor $2R$ over its previous length in $T$.

We pass from $T'$ to $T'_{\mathbb{R}}$ using Lemma 3.13 and then to $\tilde T$ by simply taking the vertex set of $\tilde T$ to
be $V(\tilde T) = \{ v \in T'_{\mathbb{R}} : d_{T'_{\mathbb{R}}}(v,r) \in \mathbb{N} \}$. We define $d_{\tilde T}$ as the unweighted shortest path metric on $\tilde T$. Then
$(\tilde T, d_{\tilde T})$ embeds isometrically into $T'_{\mathbb{R}}$. Hence $c_p(\tilde T) = O(1) \cdot c_p(T'_{\mathbb{R}}) = O(1) \cdot c_p(T') = O(1) \cdot c_p(T)$.
Furthermore, every root-leaf path in $\tilde T$ has length precisely $2^m$. Finally, observe that if $T$ was
$(\varepsilon,\delta)$-weak with height ratio $R$, then $\tilde T$ is an unweighted $(\varepsilon/(2R), \delta)$-weak prototype (because some
root-leaf path from $T$ might have increased by a factor of at most $2R$) with height ratio 1. □
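The first step of the proof (passing from $T$ to $T'$) is purely combinatorial and can be sketched in code. The following is a minimal illustration, not the authors' implementation: the tree is assumed to be given as a dictionary `children` and integer edge lengths `length`, and the names `pad_leaves` and `("pad", u)` are hypothetical. As a simplification, leaves that already sit at depth $2^m$ are left untouched rather than receiving an edge of length 0.

```python
def pad_leaves(children, length, root):
    """children: node -> list of children; length: (parent, child) -> positive int.

    Returns (new_children, new_length, m) where every leaf of the new tree
    lies at depth exactly 2**m from the root.
    """
    # Depth of every node, computed by an iterative traversal from the root.
    depth = {root: 0}
    stack = [root]
    while stack:
        u = stack.pop()
        for v in children.get(u, []):
            depth[v] = depth[u] + length[(u, v)]
            stack.append(v)

    height = max(depth.values())
    # m = ceil(log2(height)) for height >= 1, computed with integers only.
    m = (height - 1).bit_length() if height > 0 else 0
    target = 2 ** m

    new_children = {u: list(vs) for u, vs in children.items()}
    new_length = dict(length)
    for u in depth:
        is_leaf = not children.get(u)
        if is_leaf and depth[u] < target:
            pad = ("pad", u)                      # the new node attached below u
            new_children.setdefault(u, []).append(pad)
            new_children[pad] = []
            new_length[(u, pad)] = target - depth[u]
    return new_children, new_length, m
```

For a tree with leaves at depths 5 and 1 this picks $m = 3$ and attaches padding edges of lengths 3 and 7, so every root-leaf path has length 8.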

The following corollary follows from Theorem 3.12 and Lemma 3.14.

Corollary 3.15. Let $(T, d_T)$ be an $(\varepsilon,\delta)$-weak prototype with height ratio $R$. Then for any $p \ge 1$,

\[ c_p(T, d_T) \ge \Omega(1) \cdot c_p(\tilde T, d_{\tilde T}) \ge \Omega(1) \cdot \Pi_q(\tilde T, d_{\tilde T}) \ge \Omega(1) \cdot \Big( \frac{\varepsilon}{R} \log \frac{1}{\delta} \Big)^{1/q}, \]

where $q = \max\{2,p\}$ and $\tilde T$ is the associated unweighted prototype from Lemma 3.14.

The corollary follows by applying Theorem 3.12 to $\tilde T$ and using the relationship between Markov
convexity and distortion from Lemma 3.6 along with the known Markov convexity of $L_p$ spaces
(Remark 3.1).



3.3.3 The Cantor trees 

Recall that in Theorem 2.1 we showed that for any tree $T$ and $p \ge 1$, we have, for every $c > 1$,

cp(r)<o(i)(^-^.^T(c)J " . 

Here, we will show that this dependence on $\mathcal{B}_T(c)$ cannot be improved by exhibiting a family
$\{C_i\}_{i=0}^{\infty}$ of metric trees with $|C_i| \to \infty$ and such that for any fixed $c > 1$,

\[ c_p(C_i) \ge \Omega(1) \cdot \Pi_{\max\{2,p\}}(C_i) \ge \Omega(1) \cdot \mathcal{B}_{C_i}(c)^{\min\{1/2,\,1/p\}}. \tag{29} \]

Let $T$ be a rooted (unweighted) graph-theoretic tree. For any root-leaf path $P = \{v_0, v_1, \ldots, v_m\}$,
we define the downward degree sequence $d_\downarrow(P) = \{d_\downarrow(v_0), d_\downarrow(v_1), \ldots, d_\downarrow(v_m)\}$ where $d_\downarrow(v)$ is the
number of children of $v$ in $T$. We will say that $T$ is a spherically symmetric tree (SST) if for any
pair of root-leaf paths $P, P'$ we have $d_\downarrow(P) = d_\downarrow(P')$. Clearly any such tree can be completely
specified by giving the degree sequence of a root-leaf path (see Figure 4).
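To make the last remark concrete, here is a minimal sketch that builds the SST determined by a downward degree sequence. The representation (integer node ids, a `children` dictionary) and the name `build_sst` are illustrative assumptions, not notation from the paper.

```python
def build_sst(degrees):
    """Build the spherically symmetric tree with the given downward degree sequence.

    degrees[k] is the number of children of every vertex at depth k.
    Returns a dict mapping each node id to the list of its children (root is 0).
    """
    children = {0: []}
    frontier = [0]          # all vertices at the current depth
    next_id = 1
    for d in degrees:
        new_frontier = []
        for u in frontier:
            kids = list(range(next_id, next_id + d))
            next_id += d
            children[u] = kids
            for k in kids:
                children[k] = []
            new_frontier.extend(kids)
        frontier = new_frontier
    return children
```

For the sequence (2, 1, 2, 0) the levels have sizes 1, 2, 2, 4, so the tree has 9 vertices and 4 leaves.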
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Figure 4: A downward degree sequence and the corresponding SST. 



Definition of the Cantor trees. We now describe a family of downward degree sequences
inductively. For two sequences $S, S'$ we define $S \otimes S'$ as their concatenation. For every $i \in \mathbb{N}$, we
use $\mathrm{ones}(i) = \otimes^{i} \{1\}$ to denote a sequence of $i$ ones. Now define $S_0 = \{2\}$ and inductively

\[ S_{i+1} = S_i \otimes \mathrm{ones}(2^i - 1) \otimes S_i. \]

Hence the first few sequences are $\{\{2\}, \{22\}, \{22\,1\,22\}, \{22122\,111\,22122\}, \ldots\}$. To make these
proper downward degree sequences, we define $\tilde S_i$ to be $S_i$ except with the last element changed
from 2 to 0. Finally, we let $C_i$ be the unique SST with downward degree sequence $\tilde S_i$. We call these
Cantor trees because the patterns of 2's resemble finite approximations to the middle-thirds Cantor
set. It is clear that $\mathrm{length}(S_i) = 2 \cdot \mathrm{length}(S_{i-1}) + 2^{i-1} - 1 = i \cdot 2^{i-1} + 1$, and that $\log\log|C_i| = \Theta(i)$.
The next two lemmas are somewhat less obvious.
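The recursion $S_{i+1} = S_i \otimes \mathrm{ones}(2^i - 1) \otimes S_i$ and the length formula can be checked mechanically. The sketch below represents sequences as Python lists; the function names are hypothetical.

```python
def cantor_sequence(i):
    """Return S_i, built from S_0 = [2] via S_{j+1} = S_j + ones(2**j - 1) + S_j."""
    s = [2]
    for j in range(i):
        s = s + [1] * (2 ** j - 1) + s
    return s


def cantor_degree_sequence(i):
    """Return the proper downward degree sequence: S_i with its last entry set to 0."""
    s = cantor_sequence(i)
    s[-1] = 0
    return s
```

For example, `cantor_sequence(3)` reproduces the fourth sequence listed above, and each $S_i$ contains exactly $2^i$ twos, which is the source of the doubly exponential size of $C_i$.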

Lemma 3.16. For every i > 1, the tree Ci is a (^, 2~'''^^)-weak prototype. 

Proof. We need to show that every root-leaf path in Ci is degree-2 (^, 2~*/'^)-weak. Fix any such 
path $P$. It is easy to see that the maximal straits in $P$ are given by consecutive sequences of 1's
in the downward degree sequence of $C_i$. A sequence of $k$ consecutive 1's refers to a strait of length
$k + 1$. Therefore for every $j \le i - 1$, there are $2^{i-j-1}$ disjoint maximal straits of length $2^j$ in $P$.
The question becomes how small we need to choose $m$ before



Si+i = Si^ ones(2* - 1) ® 5^. 



^height(a) = ^(i-2-i + 2) < J]2 



(i — m). 



j=m 



Clearly we must have m < i/2, hence Ci is a (^,5)-weak prototype for 



5 



2i/2 



i ■ 2*-i + 1 



□ 



Combining this with Theorem 3.12 yields the following.

Corollary 3.17. For every $p < \infty$, $\Pi_p(C_i) \ge \Omega(i^{1/p})$.

The following claim completes the proof of (29).

Claim 3.18. For every fixed $c > 1$, $\mathcal{B}_{C_i}(c) \le O(i)$ as $i \to \infty$.

Proof. The idea of the proof is simple: If the edges of $B_m$ are mapped far apart in $C_i$, then we
can use the diameter of $C_i$ to upper bound the size of $m$. Otherwise, if the edges are mapped close
together, then essentially the entire image of $B_m$ must lie inside some copy of $C_{i-1}$ in $C_i$. This is
because there is a "buffer" of length $2^{i-1}$ between copies of $C_{i-1}$ in $C_i$ which contains no branch
points. An edge of $B_m$ must stretch over this buffer if the image of $B_m$ spans multiple copies of
$C_{i-1}$.

For our induction, it will be easier to bound $\mathcal{B}_{\tilde C_i}(c)$ for a slightly different family of trees $\tilde C_i$.

Let $\tilde C_i$ be the tree $C_i$ with the following two additions:

1. We append a path Hi of length 2*"^ to the root of Cj. 

2. We append a path of length 2*"^ to every leaf of Cj. We will use C = {Lj} to refer to this 
family of paths. 

Clearly $\mathcal{B}_{C_i}(c) \le \mathcal{B}_{\tilde C_i}(c)$.

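We may assume that $i \ge 1$ is sufficiently large with respect to $c$. Let $f : B_m \to \tilde C_i$ be a
bi-Lipschitz embedding of $B_m$ into $\tilde C_i$ with distortion $c = \|f\|_{\mathrm{Lip}} \cdot \|f^{-1}\|_{\mathrm{Lip}}$. Assume, for the sake
of contradiction, that $m \ge 256\, i \cdot c \log(c+1)$.

Clearly

\[ \mathrm{diam}(\tilde C_i) \ge \max_{u,v \in B_m} d_{\tilde C_i}\big(f(u), f(v)\big) \ge \frac{2m}{\|f^{-1}\|_{\mathrm{Lip}}} = \frac{2m \|f\|_{\mathrm{Lip}}}{c}. \]

Since $\mathrm{diam}(\tilde C_i) \le i \cdot 2^{i+1}$, we conclude that

\[ \max_{uv \in E(B_m)} d_{\tilde C_i}\big(f(u), f(v)\big) = \|f\|_{\mathrm{Lip}} \le \frac{c \cdot i \cdot 2^{i}}{m} \le \frac{2^{i-8}}{\log(c+1)}, \tag{30} \]

where $E(B_m)$ is the set of edges in $B_m$.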

We will now show that (30) implies that $f(B_m)$ is contained completely inside an isometric
copy of $\tilde C_{i-1}$. By induction, this will be a contradiction and finish the proof. Let us consider a
"top-down" decomposition of $\tilde C_i$ into disjoint pieces. From the root downward, we see $H_i$, then a
copy of $C_i$, then the family of paths $\mathcal{L}$. If we also break $C_i$ into constituent pieces, we see:

1. $H_i$,

2. a copy of $C_{i-1}$,

3. a family of paths $\mathcal{P}$ of length $2^{i-1}$ connected to the leaves of (2),

4. copies of $C_{i-1}$ connected to every endpoint of the paths from (3),

5. the family of paths $\mathcal{L}$ connected to the leaves of the copies of $C_{i-1}$ from (4).

We now define a family of disjoint sub-trees of $\tilde C_i$ each of which is an isometric copy of $\tilde C_{i-1}$.
The first copy C^^^^ consists of the bottom 2*~^ nodes of Hi, the copy of Ci_i from (2) above, and 
the top 2*~^ nodes of each path p G V (from (3)). The other copies are indexed by paths p £ V. 
For each such path, we construct using the bottom 2'~^ nodes of p, the copy of Ci_i from (4) 
connected to the bottom of p, and the top 2*~^ nodes of each path from £ connected to this copy 
ofCi_i. 

We claim that there exists some j G {0} U P for which f(Bm) C cf^\. We now prove the 
most difficult case; the other cases are similar. Suppose, for the sake of contradiction, there exist 
x,y £ Bjn for which f{x) G c\^\ and f{y) € C^^-^ for some p £ V. By ([30|) . every edge of Bm 
has length at most 2*~^, hence there must be some node z G B^^ for which f{z) lies in the middle 
2*"'^ nodes of p. In particular, |Bg (/(z), r)| < 2r + 1 for every r < 2*~^, since B^ {f{z),2''~^) C p. 
Furthermore, |/(-B„) n Sg^(/(z),r)| < (2r + l)||/"iLip- On the other hand, \BB„iz,r')\ > 2'^'/2 
for r' < m. 



But we have 



Let r = min{2ic ||/||Lip) 2* }. Since, in particular r < ?7i||/||Lip) the above considerations yield 

2-fc^(2..1)||/-||.„.(|±a£,^. (31) 

Observe that the inequality 2^ > 8Bc hold as long as B > 101og(c+ 1) and c > 1, but it is easy to 
check that for i > 10, we have > 101og(c+ 1) (using (^U\i ). yielding a contradiction. This 

completes the proof. □ 

Remark 3.3. Observe that the two point space $A = \{x, y\}$ with, say, $d(x,y) = 1$ is a tree metric
for which $\Pi_p(A) = O(1)$ for every $p < \infty$. On the other hand, it is easy to see that $\Pi_p([0,1]) = \infty$
for every $p < 2$, thus in general $\Pi_p(T_{\mathbb{R}}) \not\approx \Pi_p(T)$ for $p < 2$. For $p \ge 2$, the relationship is less
clear, though we suspect that a similar phenomenon holds in this case. A possible example for
which $\Pi_2(T^{(i)}_{\mathbb{R}}) \not\approx \Pi_2(T^{(i)})$ is when $T^{(i)}$ is the Cantor tree $C_i$ with every maximal strait replaced by
a single long edge. Using techniques similar to Claim 3.18 one might show that $\Pi_2(T^{(i)}) = O(1)$
as $i \to \infty$ while $\Pi_2(T^{(i)}_{\mathbb{R}}) \approx \Pi_2(C_i) \to \infty$. We do not pursue this line of reasoning further in the
present work.



4 Characterizing the distortion: strong colorings and Markov convexity

In this section we will continue to use the notation of Section 2.1. Moreover, unless explicitly stated
otherwise, all paths will be assumed to be monotone. Many of the concepts and definitions used
in this section were introduced in Section 3.3, so we suggest that the reader become familiar with
Section 3.3 before reading the present section.

The following result, which contains Theorem 1.4, is the main theorem of this section:

Theorem 4.1 (The $L_p$ distortion of trees). For $1 < p < \infty$ and every metric tree $T = (V, E)$,

\[ c_p(T) = \Theta\big( \Pi_{\max\{p,2\}}(T_{\mathbb{R}}) \big) = \Theta\Big( \Big( \log \frac{2}{\delta^*(T)} \Big)^{\min\{1/p,\,1/2\}} \Big), \]

where the implied constants may depend only on $p$.

Before proving Theorem 4.1 we make some observations. By Lemma 3.6, for every $q > 0$ and
every two metric spaces $(X, d_X)$ and $(Y, d_Y)$, $c_Y(X) \ge \Pi_q(X) / \Pi_q(Y)$. Since $L_p$ is $\max\{p,2\}$ uniformly
convex, Remark 3.1 implies that $\Pi_{\max\{p,2\}}(L_p) < \infty$. This observation, together with Theorem
and Lemma |3.13| implies that 



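\[ \Omega\big( \Pi_{\max\{p,2\}}(T_{\mathbb{R}}) \big) \le \frac{1}{5} c_p(T_{\mathbb{R}}) \le c_p(T) \le O\Big( \Big( \log \frac{2}{\delta^*(T)} \Big)^{\min\{1/p,\,1/2\}} \Big). \]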

Thus, using Corollary 3.15, the proof of Theorem 4.1 will be complete if we show that if a metric
tree $T = (V, E)$ does not admit any $\delta$-strong coloring then there exists a subtree of $T$ which
is (0(1), 2 • (5^''^))-weak prototype with height ratio 0(1). It is clearly enough to prove this for 
small enough $\delta$, so we assume in what follows that $\delta < (140)^{-2880}$ (the proof below yields much
better constants, but we chose this rough bound to simplify the ensuing exposition). The proof of
this assertion is analogous to the proof of Theorem 2.10, where "strong" colorings replace "good"
colorings, and weak prototypes take the place of complete binary trees. Since the structure of
a weak prototype is not as cleanly recursive as that of a complete binary tree, there are some
inevitable added complications. The argument will be broken down into several steps.



4.1 Preliminary results on paths in trees

In what follows, given $u, v \in V$ we shall say that a set of consecutive edges $C \subseteq P(u,v)$ is a $\delta$-cluster
if $\ell(e) < \delta\, d_T(u,v)$ for every $e \in C$.



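Lemma 4.2. Fix $\alpha \in (0, \frac{1}{2})$, $\delta \in (0,1)$, and denote $\tau = \frac{1}{2 - 4\alpha}$. Assume that $u, v \in V$ are such that
the path $P(u,v)$ is $(\frac{1}{2} + \alpha, \delta)$-weak. Then at least an $\alpha$-fraction of the length of $P(u,v)$ is covered
by $\delta$-clusters of length at least $\tau\delta\, d_T(u,v)$. Moreover, at least an $\alpha$-fraction of the length of $P(u,v)$
is covered by edge-disjoint $\delta$-clusters of length between $\tau\delta\, d_T(u,v)$ and $(2\tau + 1)\delta\, d_T(u,v)$.

Proof. Fix $u, v \in V$ and denote $P = P(u,v)$ and $d = d_T(u,v) = \ell(P)$. Let $M$ be the set of maximal
$\delta$-clusters (with respect to inclusion) contained in $P$. In what follows, for a $\delta$-cluster $C \subseteq P$ we
write $\ell(C) = \sum_{e \in C} \ell(e)$. Define $S = \{C \in M : \ell(C) < \tau\delta d\}$. For every $C \in S$, since $C$ is a
maximal $\delta$-cluster, there is an edge $e_C \in P \setminus C$ which is incident with an edge in $C$, such that
$\ell(e_C) \ge \delta d > \ell(C)/\tau$. Note that for every edge $e \in P$, $|\{C \in S : e_C = e\}| \le 2$. Now,

\[
\sum_{C \in S} \ell(C) < \tau \sum_{C \in S} \ell(e_C)
\le 2\tau \sum_{\substack{e \in P \\ \ell(e) \ge \delta d}} \ell(e)
= 2\tau \Big( d - \sum_{\substack{e \in P \\ \ell(e) < \delta d}} \ell(e) \Big)
\le 2\tau \Big( 1 - \frac{1}{2} - \alpha \Big) d = (1 - 2\alpha)\tau d.
\]

Using the fact that the path $P$ is $(\frac{1}{2} + \alpha, \delta)$-weak, we see that

\[
\Big( \frac{1}{2} + \alpha \Big) d \le \sum_{\substack{e \in P \\ \ell(e) < \delta d}} \ell(e)
= \sum_{\substack{C \in M \\ \ell(C) \ge \tau\delta d}} \ell(C) + \sum_{C \in S} \ell(C)
\le \sum_{\substack{C \in M \\ \ell(C) \ge \tau\delta d}} \ell(C) + (1 - 2\alpha)\tau d.
\]

Recalling that $\tau = \frac{1}{2 - 4\alpha}$, so that $(1 - 2\alpha)\tau = \frac{1}{2}$, we see that $\sum_{C \in M,\ \ell(C) \ge \tau\delta d} \ell(C) \ge \alpha d$, as required.

The final assertion of Lemma 4.2 is simply the fact that for any weighted path $P = (u_1, \ldots, u_m)$
such that for each $j \in \{1, \ldots, m-1\}$ we have $\ell(u_j, u_{j+1}) \le a$, but $\sum_{j=1}^{m-1} \ell(u_j, u_{j+1}) \ge A$,
there are indices $1 = p_1 < p_2 < \cdots < p_k = m$ such that for all $j \in \{1, \ldots, k-1\}$ we have
$\sum_{i=p_j}^{p_{j+1}-1} \ell(u_i, u_{i+1}) \in [A, 2A + a]$. Indeed, let $p_2 > p_1$ be the first index such that $\sum_{i=p_1}^{p_2-1} \ell(u_i, u_{i+1}) \ge A$.
Then $\sum_{i=p_1}^{p_2-1} \ell(u_i, u_{i+1}) \le A + a$. Continuing inductively as long as the length of the remaining
path is at least $A$ we find $1 = p_1 < p_2 < \cdots < p_k$ such that for $j \in \{1, \ldots, k-1\}$ we have
$\sum_{i=p_j}^{p_{j+1}-1} \ell(u_i, u_{i+1}) \in [A, A + a]$, and $\sum_{i=p_k}^{m-1} \ell(u_i, u_{i+1}) < A$. The required result follows by replacing
$p_k$ with $m$, which increases the length of the final segment by at most $A$. □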
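The greedy argument in the last paragraph of the proof translates directly into code. The following sketch uses 0-based cut positions instead of the indices $p_1, \ldots, p_k$, and the name `split_path` is hypothetical; integer edge lengths are used in the example only to avoid rounding issues.

```python
def split_path(lengths, a, A):
    """Cut a weighted path into consecutive segments of total length in [A, 2A + a].

    lengths: edge lengths along the path, each at most a, with total at least A.
    Returns cut positions 0 = c_0 < c_1 < ... < c_k = len(lengths); the j-th
    segment consists of the edges lengths[c_j : c_{j+1}].
    """
    if any(x > a for x in lengths) or sum(lengths) < A:
        raise ValueError("need every edge <= a and total length >= A")
    cuts = [0]
    running = 0
    for i, x in enumerate(lengths):
        running += x
        if running >= A:          # first index at which the segment reaches length A
            cuts.append(i + 1)    # this segment has length at most A + a
            running = 0
    # The leftover edges (total length < A) are absorbed into the last segment,
    # which therefore has length at most (A + a) + A = 2A + a.
    cuts[-1] = len(lengths)
    return cuts
```

In Lemma 4.2 this is applied to each long maximal $\delta$-cluster with $a = \delta d$ and $A = \tau\delta d$, which yields segment lengths between $\tau\delta d$ and $(2\tau+1)\delta d$.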

In order to proceed we need to generalize the notions of $\varepsilon$-good and $\delta$-strong colorings. A
coloring $\chi : E \to \mathbb{Z}$ will be called $(\varepsilon,\delta)$-strong if it is monotone, and for every $u, v \in V$

\[ \sum_{k \in \mathbb{Z}} \ell^\chi_k(u,v) \cdot \mathbf{1}_{\{\ell^\chi_k(u,v) \ge \delta d_T(u,v)\}} \ge \varepsilon\, d_T(u,v). \]

Note that we can always assume that $\varepsilon \ge \delta$. Using the terminology of Section 2.1, an $\varepsilon$-good
coloring is the same as an $(\varepsilon,\varepsilon)$-strong coloring, and a $\delta$-strong coloring is the same as a
$(\frac{1}{2}, \delta)$-strong coloring. Thus the following lemma is a generalization of Lemma 2.5.

Lemma 4.3. Fix e G (0, ^] and 5 G (0, e). Then any (e, 5)-strong coloring is is also a (^-^^^^^ -strong 
coloring. 

Proof. The proof is a slight modification of the proof of Lemma 12.51 Let x '■ ^ ^ "L he aii (e, 5)- 
strong coloring, and denote = — — We shall show that for every a G (0,1] and u,v G U, 

the total length of the monochromatic segments of length at least adT{u,v) on the path P{u,v) 
satisfies 

• l{lX^u,v)>adTiu,v)} ^ - (t) ) dT{u,v). (32) 

There are points $a_1, b_1, a_2, b_2, \ldots, a_m, b_m \in V$, ordered consecutively (from $u$ to $v$) on the path
$P(u,v)$, such that the color classes of length at least $\delta d_T(u,v)$ on the path $P(u,v)$ are precisely the
intervals $\{[a_j, b_j]\}_{j=1}^{m}$. Denote for the sake of simplicity $b_0 = u$ and $a_{m+1} = v$, and define $\beta \ge 0$
by $\beta d_T(u,v) = \sum_{j=1}^{m} d_T(a_j, b_j)$. Since the coloring is $(\varepsilon,\delta)$-strong, we know that $\beta \ge \varepsilon$. By the
definition of $\beta$ we are also assured that $m \le \beta/\delta$. If $a \ge \delta$ then inequality (32) holds vacuously, so
we assume that $a < \delta$. Arguing inductively as in the proof of Lemma 2.5 we see that

feez j=i j=o \ \ T\. j, j+ijy I 

3=0 

e / 1 ™ \ 

> dT{u,v)-{m+l)(^^) -^^^dT(6„a,+i) 1 (33) 

= dT[u,v) - {m + 1) [dT[u,v)\ I 



m + 1 



(^^y {m+lf{l-Py-'jdTiu,v) 
> (l-(f)'(f + l)\l-/?)-^)dHn,.) (34) 

f)' 0)^^(1 -^)^-^jdHn,.) (35) 
- [^-{^y)dT{u,v), (36) 

where in ([33|) we used the concavity of the function t ^ t^~^, in (j34p we used the fact that m < (5/5, 
in (I35p we used the fact that the function s ^ s^{l — s)^~^ is decreasing on [6, 1] and that (i > e >9 
(which follows from the definition of 9 and the fact that e > 6), and in (j36|) we used the elementary 
inequality (|)^ e^{l — e)^~^ < 1, which is equivalent to < \og[[i-l:)5^/'[2e) ] ' ^^^^ follows from the 
definition of 9 since e < ^- Q 

Recall that for $R > 0$ a subset $N$ of a metric space $X$ is an $R$-net if for every distinct $x, y \in N$
we have $d(x,y) \ge R$ and for every $z \in X$ there is $x \in N$ with $d(x,z) < R$. In what follows we shall
use the following variant of this notion.

Definition 4.4. Let $T = (V, E)$ be a tree rooted at $r$ with edge weights $\ell : E \to (0,\infty)$. For $R > 0$
we shall call a set $N \subseteq V$ an upward $R$-net of $T$ if for every $x, y \in N$ such that $x$ is an ancestor
of $y$ we have $d_T(x,y) \ge R$ and for every $v \in V$ there is $x \in P(v) \cap N$ such that $d_T(v,x) < R$. In
other words, $N$ is an upward $R$-net of $T$ if and only if for every $v \in V$, $N \cap P(v)$ is an $R$-net in
$P(v)$.

Observe that an upward $R$-net in $T$ need not be an $R$-net in $T$. However, the following easy
lemma shows that upward $R$-nets always exist.

Lemma 4.5. $T$ admits an upward $R$-net for every $R > 0$.

Proof. The proof is an easy induction on $|V|$. For $|V| = 1$ the result is trivial. Assume that $|V| > 1$
and let $v \in V$ be a leaf of $T$. Let $u \in V$ be the father of $v$. By the inductive hypothesis the tree
$T' = (V \setminus \{v\}, E \setminus \{(u,v)\})$ admits an upward $R$-net $N'$. Thus there exists $x \in N' \cap P(u)$ such that
$d_{T'}(x,u) = d_T(x,u) < R$; choose $x$ to be the closest such point to $u$. If $\ell(u,v) \ge R - d_T(x,u)$ define
$N = N' \cup \{v\}$, which is clearly an upward $R$-net in $T$. Otherwise $d_T(v,x) < R$, and since $x \in P(v)$, it
follows that $N'$ is also an upward $R$-net in $T$. □
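The induction in the proof of Lemma 4.5 amounts to a single top-down pass over the tree. The sketch below is one possible rendering under assumed conventions that are not from the paper: the tree is given by a `parent` map, `length[v]` is the length of the edge from `v` to its parent, and `order` lists the vertices so that parents come before children.

```python
def upward_net(parent, length, order, R):
    """Greedy construction of an upward R-net.

    parent: dict child -> parent, with the root mapped to None.
    length: dict child -> length of the edge joining the child to its parent.
    order:  vertices listed so that every parent precedes its children.
    """
    net = set()
    gap = {}   # gap[v] = distance from v up to the nearest net vertex on P(v)
    for v in order:
        p = parent[v]
        if p is None:
            net.add(v)            # the root must belong to every upward net
            gap[v] = 0
            continue
        d = gap[p] + length[v]
        if d >= R:
            net.add(v)            # nearest net ancestor is at distance >= R
            gap[v] = 0
        else:
            gap[v] = d            # v is covered by a net ancestor at distance < R
    return net
```

Each vertex is added exactly when its nearest net ancestor is at distance at least $R$, which gives the separation property along root paths; otherwise that ancestor covers it within distance less than $R$.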



4.2 Construction of a special coloring and the proof of Theorem 4.1

Our basic strategy is similar to the proof of Theorem 2.10. To emphasize the similarities between
the two proofs, we will use the same notation for the weight functions and the coloring that we
construct (this will not cause any confusion since Section 2.2 can be read independently of the
present section). As in the proof of Theorem 2.10, we will define a weight function $\mu_s$ on subtrees
of $T$, and a "scale selector" $g : V \to \mathbb{Z} \cup \{\infty\}$, which will be used to construct a coloring $\chi$ of $T$.
The fact that $\chi$ is not $\delta$-strong will be used to find an appropriate copy of a weak prototype in $T$.

We begin with some notation. Let $Q$ be a (weighted) path with initial vertex $x$ and final vertex
$y$, and let $F$ be an arbitrary tree with root $y$ (but otherwise disjoint from $Q$). For $\varepsilon, \delta \in (0,1)$
and $L > 0$ we define $\rho(\varepsilon, \delta, L; Q, F)$ to be the largest value of the minimum distance from the root to a leaf,
taken over subtrees $F' \subseteq F$ which satisfy the following three conditions:

1. Every non-leaf vertex of $F'$ has exactly one or two children.

2. Let $P$ be a root-leaf path in $F'$, and let $\tilde P$ be the vertices on $P$ which are either one of the
endpoints of $P$ or have 2 children in $F'$. Then the path $Q \cup \tilde P$ is $(\varepsilon,\delta)$-weak.

3. Every path from $x$ to a leaf of $F'$ has length at most $3L$.

Next, we construct a monotone coloring $\chi : E \to \mathbb{Z}$ and a "scale selector" $g : V \to \mathbb{Z} \cup \{\infty\}$
in a similar way to what was done in Section 2.2. Along the way we will also construct weight
functions $\{\mu_s\}_{s \in \mathbb{Z}}$ on subtrees of $T$. As in Section 2.2 we start by setting $g(r) = \infty$ and we assume
inductively that the construction is done so that whenever $v \in V$ is such that $g(v)$ is defined, if $u$
is a vertex on the path $P(v)$ then $g(u)$ has already been defined, and for every edge $e \in E$ incident
to $v$, $\chi(e)$ has been defined.

For every $t \in \mathbb{Z}$ let $N_t$ be an upward $4^t$-net of $T$. Since $N_t$ is an upward $4^t$-net, for all $w \in V$ we
are assured that $N_t \cap P(w) \cap B_T(w, 4^t) \ne \emptyset$. We define $\lambda_t(w)$ to be the point in $N_t \cap P(w) \cap B_T(w, 2 \cdot 4^t)$
which is furthest away from $w$. Now let $t(s) \in \mathbb{Z}$ be such that

\[ 240 \cdot \delta^{-1/2880} \cdot 4^s \le 4^{t(s)} < 960 \cdot \delta^{-1/2880} \cdot 4^s. \]

Take $v \in V$ which is the vertex closest to the root $r$ for which $g(v)$ hasn't yet been defined, and
as in Section 2.2 we set

\[ g(v) = \max\big\{ i \in \mathbb{Z} : \forall u \in \beta_\chi(v),\ d_T(u,v) \ge 4^{\min\{g(u),\, i\}} \big\}. \tag{37} \]

Recall that $\beta_\chi(v)$ denotes the set of breakpoints of $\chi$ along the path $P(v)$, and that by the inductive
hypothesis the path $P(v)$ has been entirely colored. Let $F$ be a subtree of $T$ rooted at $v$. We shall
now define $\mu_s(F)$. To this end define a subset of the path $P(v)$ by

\[ Q_s(v) = \{\lambda_{t(s)}(v)\} \cup \big\{ w \in \beta_\chi(v) : g(w) = g(v) \text{ and } \lambda_{t(s)}(w) = \lambda_{t(s)}(v) \big\}. \tag{38} \]
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With this notation we can define 



In (39) we extended the definition of $\mu_s$ to all subtrees of $T$ rooted at $v$. We next choose one of the
children of $v$, $w \in \mathscr{C}(v)$, for which

\[ \mu_{g(v)}(F_w) = \max_{z \in \mathscr{C}(v)} \mu_{g(v)}(F_z). \]

Observe that $\mu_{g(v)}(F_z)$ is defined for all the children of $v$, since $F_z$ is a subtree of $T$ rooted at $v$ (it
is the subtree rooted at $z \in \mathscr{C}(v)$ together with the incoming edge $\{v, z\}$). Letting $u$ be the father
of $v$ on the path $P(v)$, we set $\chi(v,w) = \chi(u,v)$, and we assign arbitrary new (i.e. which haven't
been used before) distinct colors to each of the edges $\{(v,z)\}_{z \in \mathscr{C}(v) \setminus \{w\}}$.
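The color-propagation rule just described can be sketched independently of the specific weights. In the illustration below the weight $\mu_{g(v)}(F_z)$ is replaced by an arbitrary user-supplied function `weight(v, z)`, which is an assumption made only for the sake of a runnable example; the scale selector $g$ and the quantity $\rho$ are not modeled. At the root there is no incoming color, so in this sketch every edge leaving the root receives a new color.

```python
def extend_coloring(children, root, weight):
    """Top-down monotone coloring.

    At each vertex v the color of the incoming edge continues along the edge
    to the child z maximizing weight(v, z); all other child edges receive
    fresh, previously unused colors.
    Returns a dict mapping each edge (v, z) to its integer color.
    """
    color = {}
    next_color = 0
    stack = [(root, None)]            # (vertex, color of the edge entering it)
    while stack:
        v, incoming = stack.pop()
        kids = children.get(v, [])
        if not kids:
            continue
        best = max(kids, key=lambda z: weight(v, z))
        for z in kids:
            if z == best and incoming is not None:
                color[(v, z)] = incoming      # continue the current color class
            else:
                color[(v, z)] = next_color    # start a new color class
                next_color += 1
            stack.append((z, color[(v, z)]))
    return color
```

Every color class produced this way is a path directed away from the root, which is the monotonicity property used throughout this section.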

This construction yields a monotone coloring $\chi$, a function $g : V \to \mathbb{Z} \cup \{\infty\}$, and weight
functions $\{\mu_s\}_{s \in \mathbb{Z}}$ defined on subtrees of $T$. In particular, we note here that Claim 2.11 still holds
true, since its proof only used the fact that $g$ was defined as in (37), and this formula is identical to
the one used in Section 2.2. The following lemma contains the crucial properties of the coloring $\chi$.

Lemma 4.6. Assume that the above coloring $\chi$ is not $\delta$-strong. Then there exists a sequence of
vertices $Q = (x, w_1, \ldots, w_N)$, ordered down the tree, and a number $L > 0$, such that if we define
$s, t \in \mathbb{Z}$ by $4^{s-1} < \frac{1}{240} \delta^{1/2880} L \le 4^s$ and $240\, \delta^{-1/2880}\, 4^s \le 4^t < 960\, \delta^{-1/2880}\, 4^s$, then the path metric induced
by $T$ on $Q$ has the following properties:

1. For every $j \in \{1, \ldots, N\}$ the vertex $w_j$ is a breakpoint of $\chi$.

2. For every $j \in \{1, \ldots, N\}$ we have $g(w_j) = s$ and $\lambda_t(w_j) = x$.

3. The path Q is (^^^ , S 2880^ -weak. 

4. The length of Q satisfies i{Q) = dxix^w^) G [sto'"^-^] ■ 

Before passing to the proof of Lemma 4.6 we show how it can be used to complete the proof of
Theorem 4.1.

Proof of Theorem 4.1. With Lemma 4.6 at hand, the proof of Theorem 4.1 is similar to the final
step of the proof of Theorem 2.10. Assume that $\delta > \delta^*(T)$. Let $Q = (x, w_1, \ldots, w_N)$ be the path
constructed in Lemma 4.6, and we shall also use the same $s, t, L$ obtained there. Observe that using
the notation in (38) we may assume that $Q = Q_s(w_N)$. Indeed, by adding to $Q$ any additional
breakpoint $w$ of $\chi$ along $P(w_N)$ with $g(w) = g(w_N) = s$ and $\lambda_t(w) = \lambda_t(w_N) = x$ we do not change
the conclusion of Lemma 4.6.

For $i \in \{1, \ldots, N-1\}$ let $z_i$ be the child of $w_i$ along the path $P(x, w_N)$, and denote for the sake
of simplicity $z_N = w_N$. We shall prove by induction on $i \ge 0$ that the subtree of $T$ rooted at $z_{N-i}$
(i.e. the tree $T_{z_{N-i}}$) has a further sub-tree $W_i$ satisfying the following properties.

1. Every path from $x$ to a leaf of $W_i$ has length at most $3 \cdot 4^t$.

2. Every non-leaf vertex of $W_i$ has exactly one or two children.
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3. Let P be a path from z^-i to a leaf of Wi, and let P be the vertices on P which are either 
one of the endpoints of P or have 2 children in Wi. Then the path (xjWi, . . . , i^tv-i) U P is 
(2M)''^^)-weak. 

4. Every root-leaf path in Wi has length at least dT{z]\f-i,W]\f). 

For i = we just take Wq to be the singleton wn, and the fact that the required properties are 
satisfied is asserted in Lemma [4. 61 Similarly, for z = 1 we let Wi be the tree consisting of the single 
edge {zj\[~i,wpf) which satisfies the required properties due to Lemma l46l Assuming the existence 
of Wi we proceed inductively as follows. Since WM-i is a breakpoint of x, the construction of x i^i 
the proof of Theorem 12.101 and the fact that g{wN~i) = s, implies that there is a child z'^_^ of 
WN-i for which fisiF^'^ ) > ^Xs{FzN~i) (recall that for n G ^ the tree is the subtree rooted at 
u plus the edge joining u and its parent in T). Now, since Qs{wn) = Q we also know that (since 
Xt{w]\f-i) = x) Qs^WN-i) = {x,wi, . . . ,wj\f~i}- Thus by the definition on fis in ([39]) 

(25^0''^'^''''^^^"^-^)'^--) =K^0' 



f^s{Fz^_J = pi —— ,52880,4*; g,(w;^_i),F^^_, ] = p{ l^,^^""^" A^;{x,wi, . . . ,WN-i} ,F,^_ 



But, Wi is a subtree of Fz^_- in which every non-leaf vertex has two children, for every path from 
3; to a leaf of Wi the path metric induced by T on the vertices which are either x, or one of the 
Wj, or a leaf in Wi, or have 2 children in Wi, is ^2M)' (^^^^-weak, every path from 2; to a leaf of 
Wi has length at most 3L < 3 • 4*, and the minimal distance from a root to a leaf of Wi is at least 
dT{zN-i,W]\f)- Thus the definition of p implies that ^s{Fzj^_i) > dxizN-i^WN). It follows that 
Ps{Fz'^ .) > dT{z]\f-i,wj\[), implying the existence of a subtree W^ of F^i^ , which has the same 
properties as those stated above for Wi. Joining these two subtrees at WN-i, and adding an edge 
from ZN-i+i to w^-i we obtain a subtree WN-i+i rooted at zjy-i+i with the desired properties. 
We recommend that the reader will follow the above construction using a drawing analogous to 
Figure [31 

The tree T' obtained by joining the edges {x,wi), {wi,zi) to Wjy-i is a subtree of T which is 
a (^2^ , 6 2880 ^ -weak prototype with height ratio at most ^^^'^^-^ < = 6800. As explained in 

the discussion following Theorem 14. 1[ this completes the proof. □ 



Thus, all that remains is to prove Lemma 4.6.

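Proof of Lemma 4.6. Since we are assuming that the coloring $\chi$ is not $\delta$-strong, Lemma 4.3 implies
that $\chi$ is also not $\big( \frac{1}{960}, \frac{1}{240} \delta^{1/2880} \big)$-strong. Thus there exist two vertices $u, v \in V$ such that more
than a $\frac{959}{960}$-fraction of the length of the path joining $u$ and $v$ is covered by color classes of length
less than $\frac{1}{240} \delta^{1/2880} \cdot d_T(u,v)$. Let $(b_1, \ldots, b_m)$ be the breakpoints of the coloring $\chi$ along the path
$P(u,v)$, ordered from $u$ to $v$. We also denote $b_0 = u$ and $b_{m+1} = v$. Thus

\[ \sum_{i=1}^{m+1} d_T(b_{i-1}, b_i) \cdot \mathbf{1}_{\{ d_T(b_{i-1}, b_i) < \frac{1}{240} \delta^{1/2880} d_T(u,v) \}} > \frac{959}{960}\, d_T(u,v) = \Big( \frac{1}{2} + \frac{479}{960} \Big) d_T(u,v). \]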

Lemma 4.2 (with $\alpha = \frac{479}{960}$ and $\tau = \frac{1}{2 - 4\alpha} = 240$) implies that there exists a sequence of indices
$0 \le p_1 < q_1 \le p_2 < q_2 \le \cdots \le p_{k-1} < q_{k-1} \le p_k < q_k \le m + 1$ such that for every $1 \le i \le k$
we have $d_T(b_{p_i}, b_{q_i}) \in \big[ \delta^{1/2880} d_T(u,v),\ 3 \delta^{1/2880} d_T(u,v) \big]$ and every $p_i < j \le q_i$ satisfies $d_T(b_{j-1}, b_j) <
\frac{1}{240} \delta^{1/2880} d_T(u,v)$. Moreover, the total length of these "long $\frac{1}{240} \delta^{1/2880}$-clusters" is

\[ \sum_{i=1}^{k} d_T(b_{p_i}, b_{q_i}) \ge \frac{479}{960}\, d_T(u,v). \tag{40} \]

It follows in particular from (40) that $k \ge \frac{479}{2880} \cdot \delta^{-1/2880} > 20$ (since $\delta < (140)^{-2880}$).

Denote $L = d_T(u,v)$ and recall that $s \in \mathbb{Z}$ is defined by $4^{s-1} < \frac{1}{240} \delta^{1/2880} L \le 4^s$. Fix $1 \le i \le k$
and apply Claim 2.11 to the path $P(b_{p_i}, b_{q_i})$ with $c = 2$ (which we are allowed to do by the
definition of $s$). It follows that there exist at least two indices $p_i \le j_1(i) < j_2(i) \le q_i$ such that
$g(b_{j_1(i)}) = g(b_{j_2(i)}) = s$ and $9 \cdot 4^s \le d_T(b_{j_1(i)}, b_{j_2(i)}) \le 18 \cdot 4^s$.

Now $t \in \mathbb{Z}$ is given by $240\, \delta^{-1/2880}\, 4^s \le 4^t < 960\, \delta^{-1/2880}\, 4^s$. Note that by the definition of $s$ this
implies that $L \le 4^t < 16L$. For each point $w \in \{ b_{j_1(1)}, b_{j_2(1)}, \ldots, b_{j_1(k)}, b_{j_2(k)} \}$ the vertex $\lambda_t(w)$ is in
$B_T(w, 2 \cdot 4^t) \cap N_t \cap P(w) \subseteq B_T(v, 2 \cdot 4^t + L) \cap N_t \cap P(v) \subseteq B_T(v, 3 \cdot 4^t) \cap N_t \cap P(v)$. Since $N_t \cap P(v)$
is $4^t$-separated, it follows that there are at most 4 possible vertices which could equal $\lambda_t(w)$. Thus
there is a vertex $x \in V$ and a subinterval $J \subseteq \{1, \ldots, k\}$ of size at least $\frac{k}{4} - 1 \ge \frac{k}{5}$ (since $k \ge 20$)
such that for all $i \in J$ we have $\lambda_t(b_{j_1(i)}) = \lambda_t(b_{j_2(i)}) = x$. Note that since $x = \lambda_t(w)$ for some
$w \in \{ b_{j_1(1)}, b_{j_2(1)}, \ldots, b_{j_1(k)}, b_{j_2(k)} \}$, we know that $x$ is the point in $N_t \cap P(w) \cap B_T(w, 2 \cdot 4^t)$ which
is furthest from $w$. Since $N_t$ is an upward $4^t$-net, there is a point $y \in N_t \cap P(u) \cap B_T(u, 4^t)$. So,
using $d_T(w,u) \le L \le 4^t$, we see that $y \in N_t \cap P(w) \cap B_T(w, 2 \cdot 4^t)$. Thus $x \in P(y) \subseteq P(u)$, i.e. $x$
is closer to the root than $u$.
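The choice of the scales $s$ and $t$ used above can be checked with exact arithmetic. The sketch below assumes the defining inequalities in the form $4^{s-1} < \frac{1}{240} c L \le 4^s$ and $240\, c^{-1} 4^s \le 4^t < 960\, c^{-1} 4^s$, where $c$ stands for $\delta^{1/2880}$ and is passed in as an exact fraction; the function name `scales` and the sample values are hypothetical.

```python
from fractions import Fraction


def scales(L, c):
    """Return the integers (s, t) determined by L and c.

    s is the unique integer with 4**(s-1) < c*L/240 <= 4**s, and
    t is the unique integer with 240*4**s/c <= 4**t < 960*4**s/c.
    """
    four = Fraction(4)
    x = Fraction(c) * L / 240
    s = 0
    while four ** s < x:            # increase s until 4**s >= x
        s += 1
    while four ** (s - 1) >= x:     # decrease s while 4**(s-1) is still >= x
        s -= 1
    y = 240 * four ** s / Fraction(c)
    t = 0
    while four ** t < y:            # smallest t with 4**t >= y, so 4**t < 4*y
        t += 1
    while four ** (t - 1) >= y:
        t -= 1
    return s, t
```

With these definitions $4^t \ge 240\, c^{-1} 4^s \ge L$ and $4^t < 960\, c^{-1} 4^s < 16 L$, which is the relation $L \le 4^t < 16L$ used in the proof.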

Consider the path metric induced on the vertices $Q = \{x\} \cup \{b_{j_1(i)}\}_{i \in J} \cup \{b_{j_2(i)}\}_{i \in J}$. For simplicity
of notation we enumerate it down the tree by $Q = (x, w_1, \ldots, w_N)$. We bound the length of $Q$ as
follows. First, $\ell(Q) = d_T(x, w_N) = d_T(x, w_1) + d_T(w_1, w_N) \le 2 \cdot 4^t + L \le 3 \cdot 4^t$. On the other hand,
using (j^0|) we see that 



m) > E ^t(6,,(.), 6,2W) > E 9 • 4' ^ ^ • 9 • 4^ 



9 A 1 1 3 A 1, , , 479 ^ ^ Q 

> - > 62880L > y -dribn ,bn.) > L > 

- 5 ^ 240 - 400 ^ 3 ' ^ - 400 • 960 ~ 2500 

1=1 1=1 

This also shows that the path Q is [2^^ (5 also j -weak, completing the proof of Lemma 14.61 □ 
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